REVIEW | doi:10.20944/preprints202212.0191.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: machine learning; deep learning; generative models
Online: 12 December 2022 (04:05:39 CET)
Over the past decade, research in the field of Deep Learning has brought about novel improvements in image generation and feature learning, one such example being the Generative Adversarial Network. However, these improvements have been coupled with an increasing demand for mathematical literacy and prior knowledge of the field. Therefore, in this literature review, I seek to introduce Generative Adversarial Networks (GANs) to a broader audience by explaining their background and intuition at a more foundational level. I begin by discussing the mathematical background of this architecture, specifically topics in linear algebra and probability theory. I then proceed to introduce GANs in a more theoretical framework, along with some of the literature on GANs, including their architectural improvements and image-generation capabilities. Finally, I cover state-of-the-art image generation through style-based methods, as well as their implications for society.
ARTICLE | doi:10.20944/preprints202306.1492.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: generative adversarial networks; downscaling; seasonal forecasts; ERA5-Land; EU-DEM
Online: 21 June 2023 (07:31:31 CEST)
Accurate seasonal weather forecasts are vital across a broad spectrum of applications, bearing significant environmental and socioeconomic implications. This importance renders the subject a matter of primary interest to a wide range of stakeholders, including the general public, agricultural sector, emergency responders, financial institutions, and policy strategists. The need for precision in long-term predictive models necessitates the development of innovative methodologies. These methodologies should be capable of deciphering atmospheric patterns and mechanisms with detail, especially at a local level, without the resource constraints that dynamical downscaling imposes. In response to this expanding demand, this study presents a novel solution, combining a stochastic deep learning methodology, specifically Generative Adversarial Networks (GANs), with a Digital Elevation Model (DEM). The cornerstone of the proposed model is the enhancement of gridded temperature fields from seasonal forecasts. The area of focus was the Hellenic region, wherein the spatial resolution is amplified from a coarse 1° x 1° grid to an impressively detailed 0.1° x 0.1° grid. This offers a transformative perspective for interpreting and employing this crucial meteorological data. The results suggest that the downscaled fields adequately approximate the actual spatial distribution, although the predicted values tend to slightly overestimate and in some cases underestimate the original ones. This study underlines the potential of this approach to significantly enhance the resolution and utility of weather forecasts, thereby contributing to a variety of sectors dependent on reliable meteorological data.
ARTICLE | doi:10.20944/preprints202208.0192.v1
Subject: Engineering, Automotive Engineering Keywords: Transfer Learning; Generative Adversarial Networks; MRI Brain Images
Online: 10 August 2022 (05:04:02 CEST)
Segmentation is an important step in medical imaging. In particular, machine learning, especially deep learning, has been widely used to efficiently improve and speed up the segmentation process in clinical practice. Despite the acceptable segmentation results of multi-stage models, little attention has been paid to the use of deep learning algorithms for brain image segmentation, which could be due to the lack of training data. Therefore, in this paper, we propose a Generative Adversarial Network (GAN) model that performs transfer learning to segment MRI brain images. Our model enables the generation of more labeled brain images from existing labeled and unlabeled images. Our segmentation targets brain tissue images, including white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF). We evaluate the performance of our GAN model using a commonly used evaluation metric, the Dice Coefficient (DC). Our experimental results reveal that our proposed model significantly improves segmentation results compared to the standard GAN model. We observe that our model is 2.1–10.83 minutes faster than state-of-the-art models.
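The Dice Coefficient named in the abstract above is a standard overlap measure for segmentation masks, DC = 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that general formula, not the authors' evaluation code; the function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice overlap between two binary masks (hypothetical helper)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / total)

# Toy example: one of the two predicted pixels overlaps the single true pixel.
score = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```

For a multi-tissue task such as WM/GM/CSF, one common choice is to compute this score per tissue class on one-vs-rest masks; whether the paper does so is not stated in the abstract.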
ARTICLE | doi:10.20944/preprints202104.0651.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: speech processing, data augmentation, speech emotion recognition, generative adversarial networks
Online: 26 April 2021 (10:49:55 CEST)
With the mechanization of life, speech processing has become crucial for the interaction between humans and machines. Deep neural networks require a database with enough data for training. The more features are extracted from the speech signal, the more samples are needed to train these networks. Adequate training of these networks can be ensured when there is access to sufficient and varied data in each class. If there is not enough data, data augmentation methods can be used to obtain a database with enough samples. One of the obstacles to developing speech emotion recognition systems is the data sparsity problem in each class for neural network training. The current study focuses on building a cycle generative adversarial network for data augmentation in a speech emotion recognition system. For each of the five emotions employed, an adversarial generating network is designed to generate data that is very similar to the main data in that class while remaining distinguishable from the emotions of the other classes. These networks are trained in an adversarial way to produce feature vectors resembling each class in the main feature space, and the generated vectors are then added to the existing training sets in the database to train the classifier network. Instead of the common cross-entropy error, Wasserstein divergence has been used to train the generative adversarial networks, in order to remove the vanishing gradient problem and to produce high-quality artificial samples. The suggested network has been tested for speech emotion recognition using EMODB as the training, testing, and evaluation sets, and the quality of the artificial data has been evaluated using two classifiers, a Support Vector Machine (SVM) and a Deep Neural Network (DNN). Moreover, the results show that, by extracting and reproducing high-level features from acoustic features, speech emotion recognition separating five primary emotions can be achieved with acceptable accuracy.
ARTICLE | doi:10.20944/preprints202311.1447.v1
Subject: Engineering, Mechanical Engineering Keywords: Mask R-CNN; generative adversarial network; defect detection
Online: 23 November 2023 (04:57:34 CET)
When applying deep learning methods to detect micro defects on low-contrast LCD surfaces, there are challenges related to the imbalance in the sample dataset, as well as the complexity and laboriousness of acquiring and annotating target image masks. To solve these problems, a method based on automatic sample and mask generation with deep generative network models is proposed. We first generate an augmented dataset of negative samples using a generative adversarial network (GAN), and then highlight the defect regions in these samples using the training method constructed by the GAN to generate masks for the defect images automatically. Experimental results show the effectiveness of our proposed method, as it allows for the simultaneous generation of LCD image samples and their corresponding image masks. Through a comparative experiment on the deep learning method Mask R-CNN, we demonstrate that the automatically obtained image masks have high detection accuracy.
ARTICLE | doi:10.20944/preprints201906.0104.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep Learning, Generative Adversarial Networks (GANs), Machine Learning, Autoencoders, Voice Conversion, Ethics, CycleGANs
Online: 12 June 2019 (11:17:52 CEST)
The upsurge of Generative Adversarial Networks (GANs) in the previous five years has led to advancements in unsupervised data manipulation, sourced feature translation, and precise input-output synthesis through a competitive optimization of the discriminator and generator networks. More specifically, the recent rise of cycle-consistent GANs enables style transfers from a discrete source (input A) to a target domain (input B) by preprocessing object features for a multi-discriminative adversarial network. Traditionally, cyclical adversarial networks have been exploited for unpaired image-to-image translation and domain adaptation by determining mapped relationships between an input A graphic and an input B graphic. However, this integral mechanism of domain adaptation can be applied to the complex acoustical features of human speech. Although well-established datasets, such as the 2018 Voice Conversion Challenge repository, paved the way for female-male voice transformation, cycle-GANs have rarely been re-engineered for voices outside the datasets. More critically, cycle-GANs have massive potential to extract surface-level and hidden features to distort an input A source into a texturally unrelated target voice. By preprocessing, compressing, and packaging unique acoustical voice properties, CycleGANs can learn to decompose speech signals and implement new translation models while preserving emotion, the intent of words, rhythm, and accents. Due to the potential of CycleGAN’s autoencoder in realistic unsupervised voice-voice conversion/feature adaptation, the researchers raise the ethical implications of controlling source input A to manipulate target voice B, particularly in cases of defamation and sabotage of target B’s words. This paper analyzes the potential of cycle-consistent GANs in deceptive voice-voice conversion by manipulating interview excerpts of political candidates.
ARTICLE | doi:10.20944/preprints202308.0131.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep generative model (DGM); Variational Autoencoders (VAE); Generative Adversarial Network (GAN)
Online: 2 August 2023 (03:39:21 CEST)
Generative artificial intelligence (GenAI) has been developing rapidly, with many remarkable achievements such as ChatGPT and Bard. The deep generative model (DGM) is a branch of GenAI that excels at generating raster data such as images and sound, owing to the strengths of deep neural networks (DNNs) in inference and recognition. The built-in inference mechanism of a DNN, which simulates the synaptic plasticity of the human neural network, fosters the generative ability of the DGM, which produces surprising results with the support of statistical flexibility. Two popular approaches in DGM are Variational Autoencoders (VAE) and Generative Adversarial Network (GAN). Both VAE and GAN have their own strengths, although they share an underlying statistical theory as well as the considerable complexity of the hidden layers of a DNN, which acts as an effective encoding/decoding function without concrete specification. In this research, I try to unify VAE and GAN into a consistent and consolidated model called Adversarial Variational Autoencoders (AVA), in which VAE and GAN complement each other: for instance, VAE is a good generator because it encodes data via the Kullback-Leibler divergence, and GAN is an important method for assessing whether data is realistic or fake. In other words, AVA aims to improve the accuracy of generative models; in addition, AVA extends the functionality of simple generative models. Methodologically, this research focuses on combining applied mathematical concepts with computer programming techniques in order to implement and solve complicated problems as simply as possible.
ARTICLE | doi:10.20944/preprints202310.1143.v1
Subject: Medicine And Pharmacology, Orthopedics And Sports Medicine Keywords: Computed Tomography; Generative Adversarial Networks; Deep learning; 3D reconstruction; Spinal Imaging; Spinal diagnosis; Spine surgery; Quantitative measurement; Clinical application
Online: 18 October 2023 (12:03:13 CEST)
Computed tomography (CT) offers detailed insights into the internal anatomy of patients, particularly for spinal vertebrae examination. However, CT scans are associated with higher radiation exposure and cost compared to conventional X-ray imaging. In this study, we applied a Generative Adversarial Network (GAN) framework to reconstruct 3D spinal vertebrae structures from synthetic biplanar X-ray images, specifically focusing on anterior and lateral views. The synthetic X-ray images were generated using the DRRGenerator module in 3D Slicer, incorporating segmentations of spinal vertebrae in CT scans to focus on the region of interest. The approach leverages a novel feature fusion technique based on X2CT-GAN to combine information from both views and employs a combination of Mean Squared Error (MSE) loss and adversarial loss to train the generator, resulting in high-quality synthetic 3D spinal vertebrae CTs. A total of n=440 CT data were processed. We evaluated the performance of our model using multiple metrics, including Mean Absolute Error (MAE) (for each slice of the 3D volume [MAE0] and for the entire 3D volume [MAE]), Cosine Similarity, Peak Signal-to-Noise Ratio (PSNR), 3D Peak Signal-to-Noise Ratio (PSNR-3D), and Structural Similarity Index (SSIM). The average PSNR was 28.394 dB, PSNR-3D was 27.432, SSIM was 0.468, cosine similarity was 0.484, MAE0 was 0.034, and MAE was 85.359. The results demonstrated the effectiveness of the approach in reconstructing 3D spinal vertebrae structures from biplanar X-rays, although some limitations in accurately capturing the fine bone structures and maintaining the precise morphology of the vertebrae were present. This technique has the potential to enhance the diagnostic capabilities of low-cost X-ray machines while reducing radiation exposure and cost associated with CT scans, paving the way for future applications in spinal imaging and diagnosis.
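As a rough illustration of two of the metrics listed in the abstract above, the sketch below computes MAE (over a whole volume and per slice) and PSNR with their textbook definitions. It is not the authors' evaluation code: the function names, the assumption that the first axis indexes slices, and the `data_range` parameter are all hypothetical, and the abstract does not state the intensity scaling behind the reported MAE0 and MAE values.

```python
import numpy as np

def mae(reference, estimate):
    """Mean absolute error over the whole array (e.g. a full 3D volume)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    return float(np.mean(np.abs(reference - estimate)))

def mae_per_slice(reference, estimate):
    """One MAE value per slice, assuming axis 0 indexes slices."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    return np.mean(np.abs(reference - estimate), axis=(1, 2))

def psnr(reference, estimate, data_range=1.0):
    """PSNR in dB: 10 * log10(data_range^2 / MSE); infinite if identical."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))

# Toy volume: a constant error of 0.1 on a [0, 1] intensity scale.
ref = np.zeros((2, 4, 4))
est = np.full((2, 4, 4), 0.1)
volume_mae = mae(ref, est)
volume_psnr = psnr(ref, est)
```

Because PSNR depends on `data_range`, values are only comparable across studies when the same intensity normalization is used.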
ARTICLE | doi:10.20944/preprints202107.0636.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Generative Adversarial Networks; Transfer Learning; Medical Imaging; Deep Learning Classification; Chest X-rays
Online: 28 July 2021 (17:12:31 CEST)
Data sets for medical images are generally imbalanced and limited in sample size because of high data collection costs, time-consuming annotations, and patient privacy concerns. Training deep neural network classification models on these data sets does not produce the desired generalization ability for classifying the medical condition accurately, and the models often overfit the majority-class samples. To address the issue, we propose a framework for improving the classification performance metrics of deep neural network classification models using transfer learning with pre-trained models, such as Xception, InceptionResNet, and DenseNet, along with Generative Adversarial Network (GAN)-based data augmentation. We trained the network by combining traditional data augmentation techniques, such as randomly flipping the image left to right, with GAN-based data augmentation, and then fine-tuned the hyper-parameters of the transfer learning models, such as the learning rate, batch size, and number of epochs. With these configurations, the Xception model outperformed all other pre-trained models, achieving a test accuracy of 98.7%, precision of 99%, recall of 99.3%, F1-score of 99.1%, and receiver operating characteristic (ROC) area under the curve (AUC) of 98.2%.
ARTICLE | doi:10.20944/preprints202310.1857.v1
Subject: Engineering, Architecture, Building And Construction Keywords: Interior renovation; Mixed Reality (MR); Diminished Reality (DR); Biophilic design; generative adversarial networks (GANs)
Online: 30 October 2023 (08:25:19 CET)
In contemporary society, an increasing number of individuals are identified as the "Indoor Generation". Prolonged periods spent indoors can have detrimental effects on our overall well-being. Consequently, the examination of biophilic indoor environments and their impact on occupant well-being and comfort has emerged as a significant area of study. Besides the construction of new biophilic buildings, numerous existing building stocks retain substantial social, economic, and environmental value. The need for stock renovation is considerable, revitalizing existing structures rather than embarking on new constructions. Swift feedback is crucial during the initial stages of renovation design, fostering consensus among all stakeholders. A comprehensive understanding of proposed renovation plans contributes to improved indoor environment design. Initially, generative adversarial networks (GANs) produce interior renovation drawings based on user preferences and designer styles. Subsequently, the implementation of Mixed Reality (MR) and Diminished Reality (DR) provides users with an immersive perspective, transforming 2D drawings into interactive MR experiences, and facilitating the presentation of interior renovation proposals. This paper outlines the development of a real-time system for architectural designers that seamlessly integrates MR, DR, and GANs results, with the aim of enhancing feedback efficiency during the renovation design process, enabling stakeholders to evaluate and understand renovation proposals more comprehensively. Furthermore, we incorporate several full-reference image quality assessment (FR-IQA) methods to evaluate the quality of the GANs-generated images. The evaluation results indicate that the majority of images fall within the moderate range of quality.
ARTICLE | doi:10.20944/preprints202202.0159.v1
Subject: Computer Science And Mathematics, Applied Mathematics Keywords: hybrid degradation; generative adversarial network; attention mechanism; disentangled representation
Online: 11 February 2022 (08:42:43 CET)
In this paper, we propose an unsupervised blind restoration model for images in hybrid degradation scenes. The proposed model encodes the content information and degradation information of images and then uses the attention module to disentangle the two kinds of information. It improves the disentangled representation learning ability of a generative adversarial network (GAN) for restoring images in hybrid degradation scenes, and it enhances the detailed features of the restored image and removes artifacts by combining the adversarial loss, cycle-consistency loss, and perception loss. The experimental results on the DIV2K dataset and medical images show that the proposed method outperforms existing unsupervised image restoration algorithms in terms of peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and subjective visual evaluation.
ARTICLE | doi:10.20944/preprints202101.0519.v1
Subject: Engineering, Automotive Engineering Keywords: machine learning; additive manufacturing; conditional generative adversarial network; in-situ monitoring
Online: 25 January 2021 (15:55:32 CET)
Conditional generative adversarial networks (CGANs) learn a mapping from conditional input to observed image and perform tasks in image generation, manipulation and translation. In-situ monitoring uses sensors to obtain real-time information of additive manufacturing (AM) processes that relate to process stability and part quality. Understanding the correlations between process inputs and in-situ process signatures through machine learning can enable experimental-driven predictions of future process inputs. In this research, in-situ data obtained during a metallic powder bed fusion AM process is mapped with a CGAN. A single build of two turbine blades is monitored using EOSTATE Exposure OT, a near-infrared optical tomography system of the EOS M290 system. Layerwise images generated from the in-situ monitoring system were paired with a conditional image that labeled the specimen cross-section, laser-scan stripe overlap and z-distance to part surfaces. A CGAN was trained using the turbine blade data set and employed to generate new in-situ layerwise images for unseen conditional inputs.
ARTICLE | doi:10.20944/preprints201811.0400.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Super-Resolution; Deep-learning; Generative Adversarial Networks; CMOS sensors
Online: 16 November 2018 (10:44:26 CET)
Complementary Metal-Oxide-Semiconductor (CMOS) is a typical image sensor with a wide range of applications. However, given the limitations of weather conditions and hardware cost, it is hard to capture high-resolution images with a CMOS sensor. Recently, Super-Resolution (SR) techniques for image restoration have been gaining attention due to their excellent performance. Thanks to their powerful learning ability, Generative Adversarial Networks (GANs) have proved to achieve great success. In this paper, we propose the Advanced Generative Adversarial Networks (AGAN) to efficiently correct these issues: 1) we design a Laplacian pyramid framework as a pre-trained module, which provides multi-scale features for our input; 2) at each feature block, a convolutional skip-connection network, which may contain some latent information, helps the generative model reconstruct a plausible-looking image; 3) considering that edge details usually play an important role in image generation, a novel perceptual loss function is defined to train and seek optimal parameters. The method is effective in achieving excellent and compelling quality for images captured by a CMOS sensor. Quantitative and qualitative evaluations demonstrate that our algorithm not only takes full advantage of Convolutional Neural Networks (CNNs) to improve image quality, but also performs better than previous GAN algorithms on the super-resolution task.
ARTICLE | doi:10.20944/preprints202001.0028.v2
Subject: Computer Science And Mathematics, Information Systems Keywords: anomaly detection; generative adversarial network; multiple hypothesis; particulate matter
Online: 14 February 2020 (05:23:15 CET)
The World Health Organization (WHO) provides guidelines for managing the Particulate Matter (PM) level because higher PM levels threaten human health. Managing the PM level first requires a procedure for measuring PM values. We use Tapered Element Oscillating Microbalance (TEOM)-based PM measuring sensors because they are more cost-effective than Beta Attenuation Monitor (BAM)-based sensors. However, TEOM-based sensors have a higher probability of malfunctioning than BAM-based sensors. In this paper, we refer to any such malfunction as an anomaly, and we aim to detect anomalies for the maintenance of PM measuring sensors. To this end, we propose a novel architecture named Hypothesis Pruning Generative Adversarial Network (HP-GAN). We experimentally compare several anomaly detection architectures to verify that ours performs better.
ARTICLE | doi:10.20944/preprints201810.0756.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Dermoscopic Image, Skin Lesion, Melanoma, Simulation, Generative Adversarial Networks, Deep Learning
Online: 1 November 2018 (17:54:12 CET)
Automated skin lesion analysis is one of the trending fields that has gained attention among dermatologists and healthcare practitioners. Skin lesion restoration is an essential preprocessing step for lesion enhancement for accurate automated analysis and diagnosis. Digital hair removal is a non-invasive method for image enhancement that resolves the hair-occlusion artefact in previously captured images. Several hair removal methods were proposed for skin delineation and removal. However, manual annotation is one of the main challenges that hinder the validation of these proposed methods on a large number of images or using benchmarking datasets for comparison purposes. In the presented work, we propose a realistic hair simulator based on context-aware image synthesis using image-to-image translation techniques via conditional adversarial generative networks for generation of different hair occlusions in skin images, along with the ground-truth mask for hair location. In addition, we explored three loss functions, namely the L1-norm, L2-norm and structural similarity index (SSIM), to maximise the synthesis quality. To quantitatively evaluate the realism of image synthesis, the t-SNE feature mapping and Bland-Altman test are employed as objective metrics. Experimental results show the superior performance of our proposed method compared to previous methods for hair synthesis, with plausible colours and preserved integrity of the lesion texture.
ARTICLE | doi:10.20944/preprints202304.0086.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Image fusion; generative adversarial network (GAN); local binary patterns (LBP); multi-modal images
Online: 6 April 2023 (10:03:31 CEST)
Image fusion is the process of combining multiple input images from single or multiple imaging modalities into a fused image, which is expected to be more informative for human or machine perception as compared to any of the input images. In this paper, we propose a novel method based on deep learning for fusing infrared images and visible images, named the LBP-based proportional input generative adversarial network (LPGAN). In the image fusion task, the preservation of structural similarity and image gradient information is contradictory, and it is difficult for both to achieve good performance at the same time. To solve this problem, we innovatively introduce Local Binary Patterns (LBP) into Generative Adversarial Networks (GANs), which effectively utilize the texture features of the source images, so that the network has stronger feature extraction ability and anti-interference ability. In the feature extraction stage, we introduce a pseudo-siamese network for the generator to extract the detailed features and the contrast features. At the same time, considering the characteristic distribution of different modal images, we propose a 1:4 scale input mode. Extensive experiments on the publicly available TNO dataset and CVC14 dataset show that the proposed method achieves the state-of-the-art performance. We also test the universality of LPGAN through the fusion of RGB and infrared images on the RoadScene dataset. In addition, LPGAN is applied to multi-spectral remote sensing image fusion. Both qualitative and quantitative experiments demonstrate that our LPGAN can not only achieve good structural similarity, but also retain rich detailed information.
ARTICLE | doi:10.20944/preprints202305.2218.v1
Subject: Computer Science And Mathematics, Security Systems Keywords: Generative Adversarial Network; Intrusion Detection System; Imbalanced Dataset; Machine Learning; Unsupervised Learning
Online: 31 May 2023 (10:22:58 CEST)
An Intrusion Detection System (IDS) serves as a security system that maintains constant surveillance over network traffic and host systems in order to identify any security breaches or potentially concerning activities. Recently, the rise in cyber-attacks has driven the need for automated and intelligent network intrusion detection systems. These systems are designed to learn the typical patterns of network traffic, allowing them to identify any deviations from normal behaviour, which can be classified as anomalous or malicious. Machine learning methods are widely used and show satisfactory effectiveness in detecting malicious payloads in network traffic. The exponentially increasing volume of data generated by IDSs gives rise to substantial security risks, which highlights the imperative to strengthen network security. The performance of traditional machine learning methods depends on the dataset and on how balanced the data distribution in it is. Most IDS datasets are imbalanced, which limits the performance of the machine learning method used in the system and results in missed detections and false alarms in conventional IDSs. To address this issue, this paper presents a new Generative Adversarial Network (GAN)-based model called TDCGAN to enhance the detection rate of the minority class in the dataset while maintaining efficiency. The proposed model consists of one generator and three discriminators, with an election layer at the end of the architecture. The UGR’16 data set is used for evaluation purposes. To demonstrate the efficacy of our proposed model, various machine learning algorithms have been used for comparison. The experimental findings show that TDCGAN presents an efficient solution for imbalanced intrusion detection and surpasses the performance of other oversampling machine learning methods.
ARTICLE | doi:10.20944/preprints202202.0054.v2
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: generative adversarial networks; NDVI; green areas; orthophoto; artificial datasets
Online: 19 April 2022 (10:11:20 CEST)
Generative adversarial networks (GANs) have opened new possibilities for image processing and analysis. Inpainting, dataset augmentation using artificial samples, and increasing the spatial resolution of aerial imagery are only a few notable examples of utilizing GANs in remote sensing. This is due to a unique construction and training process expressed as a duel between GAN components. The main objective of the research is to apply a GAN to generate an artificial Normalized Difference Vegetation Index (NDVI) using panchromatic images. The NDVI ground-truth labels were prepared by combining RGB and NIR orthophotos. The dataset was then utilized as input for a conditional generative adversarial network (cGAN) to perform an image-to-image translation. The main goal of the neural network was to generate an artificial NDVI image for each processed 256px × 256px patch using only information available in the panchromatic input. The network achieved 0.7569 ± 0.1083 Structural Similarity Index Measure (SSIM), 26.6459 ± 3.6577 Peak Signal-to-Noise Ratio (PSNR) and 0.0504 ± 0.0193 Root-Mean-Square Error (RMSE) on the test set. A perceptual evaluation was performed to verify the usability of the method in a real-life scenario. The research confirms that the structure and texture of the panchromatic aerial remote sensing image contain sufficient information for NDVI estimation for various objects of urban space. Even though these results can be used to highlight areas rich in vegetation and distinguish them from the urban background, there is still room for improvement in terms of the accuracy of the estimated values. The purpose of the research is to explore the possibility of utilizing GANs to enhance panchromatic images (PAN) with information related to vegetation. This opens interesting possibilities in terms of historical remote sensing imagery processing and analysis. The panchromatic orthoimagery dataset was derived from RGB orthoimagery.
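For readers unfamiliar with the index, NDVI is conventionally defined per pixel as (NIR - Red) / (NIR + Red). The sketch below shows how ground-truth NDVI labels could be derived from co-registered NIR and red bands under that standard definition; it is not the authors' preprocessing pipeline, and the function name and the choice to return 0 where both bands are zero are assumptions.

```python
import numpy as np

def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red), values in [-1, 1]."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    total = nir + red
    # Avoid division by zero; such pixels are set to 0 (assumed convention).
    safe_total = np.where(total == 0, 1.0, total)
    return np.where(total == 0, 0.0, (nir - red) / safe_total)

# Toy reflectances: vegetated pixel, empty pixel, and a neutral surface.
labels = ndvi([0.5, 0.0, 0.2], [0.1, 0.0, 0.2])
```

Healthy vegetation reflects strongly in the near infrared and absorbs red light, so high positive values indicate vegetated pixels.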
ARTICLE | doi:10.20944/preprints202207.0356.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: Multi-site statistical downscaling; Generative Adversarial Network; Combination of Errors; Convolutional Neural Network; Structural Similarity Index; Wasserstein GAN; extreme precipitation
Online: 25 July 2022 (07:59:59 CEST)
Although the statistical methods of downscaling climate data have progressed significantly, the development of high-resolution precipitation products continues to be a challenge. This is especially true when interest centres on downscaling values over several study sites. In this paper, we report a new downscaling method termed the multi-site Climate Generative Adversarial Network (MSCliGAN), which can simulate annual maximum precipitation at the regional scale during the 1950-2010 period in different cities in Canada by using different AOGCMs from the Coupled Model Inter-Comparison Project 6 (CMIP6) as input. Auxiliary information provided to the downscaling model included topography and land cover. The downscaling framework uses a convolutional encoder-decoder U-net network as the generative network and a convolutional encoder network as the critic network. An adversarial training strategy is used to train the model. The critic/discriminator uses the Wasserstein distance as a loss measure, whereas the generator is optimized using the sum of a content loss based on the Nash-Sutcliffe Model Efficiency (NS), a structural loss based on the structural similarity index (SSIM), and an adversarial loss based on the Wasserstein distance. Downscaling results show that downscaling AOGCMs by incorporating topography and land use/land cover can produce spatially coherent fields close to observations over multiple sites. We believe the model has sufficient downscaling potential in data-sparse regions where climate change information is often urgently needed.
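The Nash-Sutcliffe efficiency referenced in the abstract above is commonly defined as NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))². The sketch below implements only that standard definition; how the authors convert it into a content loss for the generator is not stated in the abstract, so the function name and the error raised for constant observations are assumptions.

```python
import numpy as np

def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    obs = np.asarray(observed, dtype=np.float64)
    sim = np.asarray(simulated, dtype=np.float64)
    denom = np.sum((obs - obs.mean()) ** 2)
    if denom == 0:
        # Undefined when observations have no variance (assumed handling).
        raise ValueError("observed values must not all be identical")
    return float(1.0 - np.sum((obs - sim) ** 2) / denom)

# A simulation that always predicts the observed mean scores exactly 0.
baseline = nash_sutcliffe([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Since higher NSE is better, one plausible way to minimize it in training is to use 1 - NSE as the loss term, though that is a guess rather than something the abstract confirms.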
ARTICLE | doi:10.20944/preprints202011.0527.v1
Subject: Engineering, Aerospace Engineering Keywords: Aircraft Maintenance Inspection; Anomaly Detection; Defect Inspection; Convolutional Neural Networks; Mask R-CNN; Generative Adversarial Networks; Image Augmentation
Online: 20 November 2020 (09:16:13 CET)
Convolutional Neural Networks combined with autonomous drones are increasingly seen as enablers of partially automating the aircraft maintenance visual inspection process. Such an innovative concept can have a significant impact on aircraft operations. By helping aircraft maintenance engineers detect and classify a wide range of defects, the time spent on inspection can be significantly reduced. Examples of defects that can be automatically detected include aircraft dents, paint defects, cracks and holes, and lightning strike damage. Additionally, this concept could also increase the accuracy of damage detection and reduce the number of aircraft inspection incidents related to human factors like fatigue and time pressure. In our previous work, we applied a recent Convolutional Neural Network architecture known as Mask R-CNN to detect aircraft dents. Mask R-CNN was chosen because it enables the detection of multiple objects in an image while simultaneously generating a segmentation mask for each instance. The previously obtained F1 and F2 scores were 62.67% and 59.35% respectively. This paper extends the previous work by applying different techniques to improve and evaluate prediction performance experimentally. The approaches used include (1) balancing the original dataset by adding images without dents; (2) increasing data homogeneity by focusing on wing images only; (3) exploring the potential of three augmentation techniques, namely flipping, rotating, and blurring, in improving model performance; and (4) using a pre-classifier in combination with Mask R-CNN. The results show that a hybrid approach combining Mask R-CNN and augmentation techniques leads to improved performance, with an F1 score of 67.50% and an F2 score of 66.37%.
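For readers unfamiliar with the two reported metrics, the sketch below shows the general F-beta score, of which F1 (beta = 1) and F2 (beta = 2) are special cases. The precision and recall values in the example are hypothetical; the abstract reports only the resulting scores, not the underlying precision and recall.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score: beta > 1 weights recall more heavily than precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical detector that finds every dent but raises many false alarms.
f1 = f_beta(0.5, 1.0, beta=1.0)
f2 = f_beta(0.5, 1.0, beta=2.0)
```

Because F2 favours recall, the hypothetical detector above scores higher on F2 than on F1, which is why F2 is often reported alongside F1 when missed defects are costlier than false alarms.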
ARTICLE | doi:10.20944/preprints202309.2100.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: quantum algorithms; quantum machine learning; generative ai; quantum generative ai; generative learning; diffusion ai
Online: 29 September 2023 (14:02:07 CEST)
Image synthesis poses a challenging problem that researchers in computer vision and machine learning have been grappling with for several decades. Numerous machine learning techniques have emerged and proven effective in generating high-fidelity artificial images. This study breaks new ground by exploring image synthesis through generative learning using the D-Wave 2000Q quantum annealer, marking the first attempt to address the issue of generative image synthesis on a Quantum Processing Unit (QPU). Alongside executing image synthesis on the quantum annealer, this research also compares its performance with existing classical models and delves into resolving the Generative Learning Trilemma.
ARTICLE | doi:10.20944/preprints202212.0570.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Drone and Aerial Remote Sensing; Image Deblurring; Generative Adversarial Networks; Multi-Scale; Image blur level; Object Detection; Deep Learning
Online: 30 December 2022 (04:45:12 CET)
Drone and aerial remote sensing images are widely used, but their imaging environment is complex and prone to image blurring. Existing CNN deblurring algorithms usually use multi-scale fusion to extract features in order to make full use of aerial remote sensing blurred image information, but images with different degrees of blurring use the same weights, leading to increasing errors in the feature fusion process layer by layer. Based on the physical properties of image blurring, this paper proposes an adaptive multi-scale fusion blind deblurred generative adversarial network (AMD-GAN), which innovatively applies the degree of image blurring to guide the adjustment of the weights of multi-scale fusion, effectively suppressing the errors in the multi-scale fusion process and enhancing the interpretability of the feature layer. The research work in this paper reveals the necessity and effectiveness of a priori information on image blurring levels in image deblurring tasks. By studying and exploring the image blurring levels, the network model focuses more on the basic physical features of image blurring. Meanwhile, this paper proposes an image blurring degree description model, which can effectively represent the blurring degree of aerial remote sensing images. The comparison experiments show that the algorithm in this paper can effectively recover images with different degrees of blur, obtain high-quality images with clear texture details, outperform the comparison algorithm in both qualitative and quantitative evaluation, and can effectively improve the object detection performance of aerial remote sensing blurred images. Moreover, the average PSNR of this paper's algorithm tested on the publicly available dataset RealBlur-R reached 41.02dB, surpassing the latest SOTA algorithm.
ARTICLE | doi:10.20944/preprints202011.0039.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: generative model; human movement; conditional Deep Convolutional Generative Adversarial Network; GAN; spatio-temporal pseudo-image
Online: 2 November 2020 (12:55:22 CET)
Generative models for images, audio, text and other low-dimension data have achieved great success in recent years. Generating artificial human movements can also be useful for many applications, including improvement of data augmentation methods for human gesture recognition. The object of this research is to develop a generative model for skeletal human movement that allows control of the action type of the generated motion while keeping the authenticity of the result and the natural style variability of gesture execution. We propose to use a conditional Deep Convolutional Generative Adversarial Network (DC-GAN) applied to pseudo-images representing skeletal pose sequences using the Tree Structure Skeleton Image format. We evaluate our approach on the 3D-skeleton data provided in the large NTU RGB+D public dataset. Our generative model can output qualitatively correct skeletal human movements for any of its 60 action classes. We also quantitatively evaluate the performance of our model by computing Fréchet Inception Distances, which show strong correlation with human judgement. To our knowledge, our work is the first successful class-conditioned generative model for human skeletal motions based on a pseudo-image representation of skeletal pose sequences.
CONCEPT PAPER | doi:10.20944/preprints202203.0014.v1
Subject: Biology And Life Sciences, Anatomy And Physiology Keywords: body plan; archetype; burden; generative entrenchment
Online: 1 March 2022 (10:17:43 CET)
A body plan is a stable configuration of characters for a major taxonomic group, such as chordates or arthropods. Despite widespread casual reliance on the concept for guiding comparisons within and between groups, the nature of body plans as well as the biological causes underlying their evolution have remained elusive. This paper proposes an abstract mechanistic model of body plan identity. We hypothesize that body plans are an evolutionary phenomenon that only applies to a relatively small subset of major clades, rather than being associated with each and every so-called “phylum.” Body plans arise in evolution by stepwise accretion, and require a level of developmental complexity that is only found in some animal clades. Further, we suggest that, parallel to the developmental mechanisms controlling character identity, there are “body plan identity mechanisms” (BpIMs) that maintain entire configurations of characters while possessing a mechanistic architecture that is itself stable and traceable through evolutionary change. These BpIMs, we suggest, are entrenched intercellular signaling networks operating between transient embryonic structures that are destined to differentiate into distinct individualized characters. The activity of a BpIM results in a transient long-range integration of the embryo that is highly sensitive to genetic and environmental perturbations, and that can be detected morphologically as a conserved phylotypic stage. This model is illustrated with detailed interpretations of the notochord signaling system and the segment polarity network as candidate BpIMs in vertebrates and arthropods, respectively. We conclude by contrasting the proposed developmental-mechanistic conception of body plans with the phylogenetic notion of ground plans, and sketch the general outlines of an empirical research program on body plan evolution.
ARTICLE | doi:10.20944/preprints202103.0692.v1
Subject: Engineering, Civil Engineering Keywords: Genetic algorithms, structures, algorithms, generative design
Online: 29 March 2021 (12:50:16 CEST)
Algorithms and computational tools now intersect with nearly every field. Generative design, specifically design using genetic algorithms, is an increasingly effective yet cost-efficient way to generate architectural designs in modern engineering. Thus, we adopt a genetic algorithm model in pursuit of maximizing the durability of a structure when it is stressed while minimizing the material cost. After the model is formulated, the algorithm is able to approximate with high accuracy the load a small-scale structure is able to bear, as well as iterate upon its designs to maximize a fitness function.
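The general shape of such a genetic algorithm can be sketched as below. Everything specific here is an assumption for illustration: the design encoding (a list of member thicknesses in [0, 1]), the toy fitness that trades the weakest member against average material use, and all hyperparameters. The abstract does not describe the authors' structural model or fitness function.

```python
import random

def fitness(design):
    """Toy trade-off (hypothetical): weakest member governs load, material costs."""
    strength = min(design)
    cost = sum(design) / len(design)
    return strength - 0.5 * cost

def evolve(pop_size=30, genes=5, generations=40, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(genes)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0]]             # elitism: keep the best design
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genes)      # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation, clamped to the valid thickness range
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

best_design, best_history = evolve()
```

Keeping the best design unchanged in each generation means the recorded best fitness never decreases, which makes convergence easy to inspect.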
ARTICLE | doi:10.20944/preprints202103.0651.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Abstraction; Complexity Management; Patterns; Generative Patterns
Online: 25 March 2021 (17:29:09 CET)
According to many researchers, abstraction is the basis of mathematics, computing, counting devices, and computer science and engineering. What is more, all of the above deal with complexity management in some way, and abstraction is the most basic mechanism of complexity management. Generative software development, whether in the sense of empowering humans by machine to create software or in the sense of reusing products, has been and is one of the serious concerns and goals of software engineering. The interesting thing is that in both views of generativity, the main issue is still, in a way, complexity management: whether this complexity management is to achieve diversity and reuse management (Czarnecki’s approach) or to structure from existing structures (the approach of Alexander and his followers in the object-oriented community). In this article, we will first look at complexity and its various definitions. These definitions show that, despite the different perspectives on complexity in different disciplines and domains, all point in one direction. We will conclude that complexity is rooted in multiplicity. We will then formally define complexity. Next, we will look at the generative patterns of software development, and then at the complexity management patterns at seven levels. In this article, the author has tried to maintain a comprehensive approach to complexity and to consider the approaches of different domains to complexity.
ARTICLE | doi:10.20944/preprints202211.0040.v1
Subject: Engineering, Telecommunications Keywords: EMF exposure; conditional generative adversarial network; optimization
Online: 2 November 2022 (03:43:14 CET)
With the ongoing fifth-generation cellular network (5G) deployment, electromagnetic field exposure has become a critical concern. However, measurements are scarce, and accurate electromagnetic field reconstruction in a geographic region remains challenging. This work proposes a conditional Generative Adversarial Network to address this issue. The main objective is to reconstruct the electromagnetic field exposure map accurately according to the environment’s topology from a few sensors located in an outdoor urban environment. The model is trained to learn and estimate the propagation characteristics of the electromagnetic field according to the topology of a given environment. In addition, the conditional Generative Adversarial Network based electromagnetic field mapping is compared with simple kriging. Results show that the proposed method produces accurate estimates and is a promising solution for exposure map reconstruction.
ARTICLE | doi:10.20944/preprints202205.0398.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: Generative Software Development; Code Generation; Complexity Space
Online: 30 May 2022 (11:32:07 CEST)
This survey proposed an evaluation model to analyze and examine different approaches to generativity. In addition to problem domain concepts, the following concepts were used to define this model: Complexity and complexity management, and Systematics view to achieve unified and integrated insight into disparate evaluation criteria. The research's approach to the said concepts is first introduced. Then, the evaluation model is presented.
ARTICLE | doi:10.20944/preprints202104.0556.v1
Subject: Engineering, Automotive Engineering Keywords: super-resolution; generative adversarial network; Sentinel-2
Online: 21 April 2021 (08:25:54 CEST)
Sentinel-2 can provide multi-spectral optical remote sensing images in RGBN bands with a spatial resolution of 10m, but the spatial details provided are not enough for many applications. WorldView can provide HR multi-spectral images at resolutions finer than 2m, but it is a commercial paid resource with relatively high usage costs. In this paper, without any available reference images, Sentinel-2 images at 10m resolution are improved to a resolution of 2.5m through super-resolution (SR) based on deep learning technology. Our model, named DKN-SR-GAN, uses degradation kernel estimation and noise injection to construct a dataset of near-natural low-high-resolution (LHR) image pairs, with only low-resolution (LR) images and no high-resolution (HR) prior information. DKN-SR-GAN uses a Generative Adversarial Network (GAN) composed of an ESRGAN-type generator, a PatchGAN-type discriminator and a VGG-19-type feature extractor, using perceptual loss to optimize the network, so as to obtain SR images with clearer details and better perceptual effects. Experiments demonstrate that our proposed model has obvious advantages over state-of-the-art models such as EDSR8-RGB, RCAN and RS-ESRGAN, both in quantitative comparison on non-reference image quality assessment (NR-IQA) metrics like NIQE, BRISQUE and PIQE and in the intuitive visual effects of the generated images.
ARTICLE | doi:10.20944/preprints202309.1768.v2
Subject: Computer Science And Mathematics, Security Systems Keywords: Generative Pre-training Transformer; ChatGPT; cyberattacks; ChatGPT cybersecurity
Online: 8 November 2023 (16:16:14 CET)
The Chat Generative Pre-training Transformer (GPT), also known as ChatGPT, is a powerful generative AI model that can simulate human-like dialogues across a variety of domains. However, this popularity has attracted the attention of malicious actors who exploit ChatGPT to launch cyberattacks. This paper examines the tactics that adversaries use to leverage ChatGPT in a variety of cyberattacks. Attackers pose as regular users and manipulate ChatGPT’s vulnerability to malicious interactions, particularly in the context of cyber assault. The paper presents illustrative examples of cyberattacks that are possible with ChatGPT and discusses the realm of ChatGPT-fueled cybersecurity threats. The paper also investigates the extent of user awareness of the relationship between ChatGPT and cyberattacks. A survey of 253 participants was conducted, and their responses were measured on a three-point Likert scale. The results provide a comprehensive understanding of how ChatGPT can be used to improve business processes and identify areas for improvement. Over 80% of the participants agreed that cyber criminals use ChatGPT for malicious purposes. This finding underscores the importance of improving the security of this novel model. Organizations must take steps to protect their computational infrastructure. This analysis also highlights opportunities for streamlining processes, improving service quality, and increasing efficiency. Finally, the paper provides recommendations for using ChatGPT in a secure manner, outlining ways to mitigate potential cyberattacks and strengthen defenses against adversaries.
ARTICLE | doi:10.20944/preprints201912.0028.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; generative adversarial networks; image inpainting; diversity inpainting
Online: 3 December 2019 (12:04:01 CET)
The latest methods based on deep learning have achieved amazing results regarding the complex work of inpainting large missing areas in an image. This type of method generally attempts to generate one single "optimal" inpainting result, ignoring many other plausible results. However, considering the uncertainty of the inpainting task, one sole result can hardly be regarded as a desired regeneration of the missing area. In view of this weakness, which is related to the design of the previous algorithms, we propose a novel deep generative model equipped with a brand new style extractor which can extract the style noise (a latent vector) from the ground truth image. Once obtained, the extracted style noise and the ground truth image are both input into the generator. We also craft a consistency loss that guides the generated image to approximate the ground truth. Meanwhile, the same extractor captures the style noise from the generated image, which is forced to approach the input noise according to the consistency loss. After iterations, our generator is able to learn the styles corresponding to multiple sets of noise. The proposed model can generate a (sufficiently large) number of inpainting results consistent with the context semantics of the image. Moreover, we check the effectiveness of our model on three databases, i.e., CelebA, Agricultural Disease, and MauFlex. Compared to state-of-the-art inpainting methods, this model is able to offer desirable inpainting results with both a better quality and higher diversity. The code and model will be made available on https://github.com/vivitsai/SEGAN.
ARTICLE | doi:10.20944/preprints202311.1231.v1
Subject: Computer Science And Mathematics, Other Keywords: design brief; ai thinking; generative ai; design management; user experience
Online: 20 November 2023 (09:26:04 CET)
This study examines the impact of GenAI tools on the daily tasks of designers within corporations. It investigates both the operational changes and employee anxieties regarding job security. The research employs a qualitative approach: one without the use of ChatGPT and one with its use. The findings indicate significant improvements in operational experience and subjective perceptions across various tasks, as demonstrated through a user experience map. Moreover, the study highlights the potential of AI for enhancing managerial efficiency, streamlining workflows, and improving collaboration. However, it also addresses challenges concerning information authenticity, copyright protection, and professional identity. The goal of this study is to comprehend AI's current role in businesses, evaluate its effects on designers, and offer balanced recommendations that emphasize the integration of AI thinking into future corporate workflows from a human-centric perspective.
TECHNICAL NOTE | doi:10.20944/preprints202311.1130.v1
Subject: Computer Science And Mathematics, Software Keywords: Genomics; Variant Interpretation; Generative AI; Genome analysis; Rare Diseases; Bioinformatics
Online: 17 November 2023 (05:24:56 CET)
In the modern era of genomic research, the scientific community is witnessing an explosive growth in the volume of published findings. While this abundance of data offers invaluable insights, it also places a pressing responsibility on genetic professionals and researchers to stay informed about the latest findings and their clinical significance. Genomic variant interpretation is currently facing a challenge in identifying the most up-to-date and relevant scientific papers, while also extracting meaningful information to accelerate the process from clinical assessment to reporting. Computer-aided literature search and summarization can play a pivotal role in this context. By synthesizing complex genomic findings into concise, interpretable summaries, this approach facilitates the translation of extensive genomic datasets into clinically relevant insights. To bridge this gap, we present VarChat (varchat.engenome.com), an innovative tool based on generative AI, developed to find and summarize the fragmented scientific literature associated with genomic variants into brief yet informative texts. VarChat provides users with a concise description of specific genetic variants, detailing their impact on related proteins and possible effects on human health. Additionally, VarChat offers direct links to related trustworthy scientific sources, and encourages deeper research. Availability: VarChat is freely available at varchat.engenome.com
ARTICLE | doi:10.20944/preprints202310.1621.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: ballastless track; image shadow removal; generative adversarial network; computer vision
Online: 25 October 2023 (09:19:13 CEST)
Track fasteners play a pivotal role in infrastructure inspection for high-speed rail. Yet, images taken by drones often capture shadows cast by electrical towers flanking the high-speed rail tracks. These shadows can hinder the visibility of the track fasteners, thereby impacting detection efficiency and accuracy considerably. The present paper introduces an end-to-end shadow removal algorithm, rooted in generative adversarial network training. The comprehensive network framework is segmented into three sub-networks: pseudo-mask generation, shadow removal, and result refinement. We have integrated a Fourier convolutional residual module to bolster the feature extraction capability of the generator network. This integration ensures the network retains a global receptive field, even in its more superficial layers. By employing an overall weighted loss function, we enhance the quality of the images produced without shadows. Further, a perceptual loss function has been incorporated to retain the structural information of objects, setting the stage for subsequent defect detection. Our results highlight that Pse-ShadowNet adeptly eradicates fastener shadows while maintaining vital visual features, including object position, structure, texture, edges, and other key visual elements. Consequently, the reconstructed images are detailed and showcase superior image quality.
ARTICLE | doi:10.20944/preprints202307.1667.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: convolutional neural network; deep learning; skin cancer; generative adversarial network
Online: 25 July 2023 (11:47:15 CEST)
ARTICLE | doi:10.20944/preprints201807.0340.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: dialectical generative adversarial network; image translation; Sentinel-1; TerraSAR-X
Online: 19 July 2018 (04:46:22 CEST)
Unlike optical images, Synthetic Aperture Radar (SAR) images lie in a different part of the electromagnetic spectrum, one to which the human visual system is not accustomed. Thus, with more and more SAR applications, the demand for enhanced high-quality SAR images has increased considerably. However, high-quality SAR images entail high costs due to the limitations of current SAR devices and their image processing resources. To improve the quality of SAR images and to reduce the costs of their generation, we propose a Dialectical Generative Adversarial Network (Dialectical GAN) to generate high-quality SAR images. This method is based on the analysis of hierarchical SAR information and the “dialectical” structure of GAN frameworks. As a demonstration, a typical example will be shown where a low-resolution SAR image (e.g., a Sentinel-1 image) with large ground coverage is translated into a high-resolution SAR image (e.g., a TerraSAR-X image). Three traditional algorithms are compared, and a new algorithm is proposed based on a network framework by combining conditional WGAN-GP (Wasserstein Generative Adversarial Network - Gradient Penalty) loss functions and Spatial Gram matrices under the rule of dialectics. Experimental results show that the SAR image translation works very well when we compare the results of our proposed method with the selected traditional methods.
ARTICLE | doi:10.20944/preprints202309.1541.v1
Subject: Social Sciences, Education Keywords: higher education; technology acceptance; generative pre-trained transformer; curriculum design; academia
Online: 22 September 2023 (08:39:44 CEST)
Artificial intelligence (AI)-based models hold the potential to transform higher education if adopted properly and ethically. A prime example is ChatGPT, with earlier studies indicating its widespread adoption in academia and by university students. The current study aimed to identify the factors influencing the attitude towards ChatGPT and its usage among university students in Arab countries. A previous survey instrument termed Technology Acceptance Model Edited to Assess ChatGPT Adoption (TAME-ChatGPT) was administered online using a convenience-based approach among the contacts of the authors. Confirmatory factor analysis (CFA) for the survey constructs was done using root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), comparative fit index (CFI), and Tucker-Lewis index (TLI). The final study sample comprised a total of 2240 participants divided as follows: Iraq (n=736, 32.9%), Kuwait (n=582, 26.0%), Egypt (n=417, 18.6%), Lebanon (n=263, 11.7%), and Jordan (n=242, 10.8%). A total of 1048 respondents had heard of ChatGPT before the study (46.8%), of whom 551 had used ChatGPT (52.6%). The mean scores of TAME-ChatGPT constructs showed that the ease of ChatGPT use, positive attitude towards technology and social influence, higher perceived usefulness, the influence of behavioral/cognitive factors, low perceived risks and low anxiety were the determinants of positive attitude to ChatGPT and its use. For both the attitude and usage scales of TAME-ChatGPT, CFA collectively yielded satisfactory fit indices as indicated by low RMSEA and SRMR together with high CFI and TLI. Multivariate analysis showed that attitude to ChatGPT use was significantly influenced by country of residence, age, university type, and the latest grade point average of the students. The current study confirmed the validity of TAME-ChatGPT as a survey instrument to assess the possible determinants of ChatGPT use among university students in Arab countries. 
The study findings highlighted that successful adoption of ChatGPT in higher education could be dependent on perceived ease of use, usefulness, positive attitudes to technology, social influence, behavioral/cognitive factors, lower anxiety, and minimal perceived risks. The utility of ChatGPT in higher education requires policies that should be tailored for various settings, considering the differences observed in attitude towards ChatGPT among participating students in this study.
ARTICLE | doi:10.20944/preprints201806.0230.v1
Subject: Social Sciences, Education Keywords: interdisciplinary communication; early architectural design stage; procedural information; generative modeling; dashboard
Online: 14 June 2018 (10:33:18 CEST)
The purpose of interdisciplinary communication during the early architectural design stage is to achieve the early integration of knowledge in different professional fields, which can help architects to choose correct design development strategies during the early architectural design stage. However, because there is too little information at the early design stage, and design solutions are still rapidly changing and developing, the uncertainties at this stage make it difficult for consultants in other disciplines to provide their views and analysis. In spite of this situation, the emergence of generative modeling is changing design procedures and methods of communication and cooperation for architectural teams, and has brought about a shift in the way architects transmit design information from "what" (declarative information) to "how" (procedural information). Generative modeling is like an aircraft's dashboard: It can provide a basis for interdisciplinary communication, provide interdisciplinary knowledge packages, and bring about a shift in interdisciplinary communication that will reduce the architectural team's communication needs and cost. This study uses a real design case to show the feasibility of generative modeling. Employing generative modeling as a basis, architects can enhance the efficiency of design change and multi-disciplinary communication during the early design stage, integrate specialized knowledge in relevant fields, use this knowledge to formulate design criteria for the next stage, and effectively transmit design decisions. As a consequence, the changes to the cost structure of design revisions and communication between different disciplines has initiated a paradigm shift toward multi-disciplinary communication in architectural design.
ARTICLE | doi:10.20944/preprints202306.0323.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: text mining; semantic analysis; labeling bull-bear words; futures corpus; generative AI
Online: 5 June 2023 (13:18:24 CEST)
For highly time-constrained very short-term investors, reading and extracting valuable information from financial news poses significant challenges. The wide range of topics covered in these news articles further compounds the difficulties for investors. The diverse content adds complexity and uncertainty to the text, making it arduous for very short-term investors to swiftly and accurately extract valuable insights. Variations in authors, media sources, and cultural backgrounds also introduce additional complexities. Hence, performing a bull-bear semantic analysis of financial news using text mining technologies can alleviate the volume, time, and energy pressures on very short-term investors while enhancing the efficiency and accuracy of their investment decisions. This study proposes a detection method that labels bull-bear words from a futures corpus and extracts valuable information from financial news, allowing investors to understand market trends quickly. Generative AI models are trained to provide real-time bull-bear advice, aiding investors in adapting to market changes and devising effective trading strategies. Experimental results show the effectiveness of various models, with Random Forest and SVMs achieving an 80% accuracy rate. MLP and deep learning models also perform well. By leveraging these models, the study reduces the time spent reading financial articles, enabling faster decision-making and increasing the likelihood of investment success. Future research can explore the application of this method in other domains and enhance model design for improved predictive capabilities and practicality.
REVIEW | doi:10.20944/preprints202305.0900.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Generative Artificial Intelligence; Large Language Models; ChatGPT; Bard; Transformer Architecture; Prompt Engineering
Online: 12 May 2023 (07:47:13 CEST)
This primer provides an overview of the rapidly evolving field of generative artificial intelligence, specifically focusing on large language models like ChatGPT (OpenAI) and Bard (Google). Large language models have demonstrated unprecedented capabilities in responding to natural language prompts. The aim of this primer is to demystify the underlying theory and architecture of large language models, providing intuitive explanations for a broader audience. Learners seeking to gain insight into the technical underpinnings of large language models must sift through rapidly growing and fragmented literature on the topic. This primer brings all the main concepts into a single digestible document. Topics covered include text tokenization, vocabulary construction, token embedding, context embedding with attention mechanisms, artificial neural networks, and objective functions in model training. The primer also explores state-of-the-art methods in training large language models to generalize on specific applications and to align with human intentions. Finally, an introduction to the concept of prompt engineering highlights the importance of effective human-machine interaction through natural language in harnessing the full potential of artificial intelligence chatbots. This comprehensive yet accessible primer will benefit students and researchers seeking foundational knowledge and a deeper understanding of the inner workings of existing and emerging artificial intelligence models. The author hopes that the primer will encourage further responsible innovation and informed discussions about these increasingly powerful tools.
ARTICLE | doi:10.20944/preprints202011.0696.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Archaeological Data Science; Artificial Intelligence; Unsupervised Learning; Generative Adversarial Networks; Robust Statistics.
Online: 27 November 2020 (14:43:36 CET)
The fossil record is notorious for being incomplete and distorted, frequently conditioning the type of knowledge that can be extracted from it. In many cases, this leads to issues when performing complex statistical analyses, such as classification tasks, predictive modelling, and variance analyses, such as those used in Geometric Morphometrics. Here different Generative Adversarial Network architectures are experimented with, testing the effects of sample size and domain dimensionality on model performance. For model evaluation, robust statistical methods were used. Each of the algorithms was observed to produce realistic data. Generative Adversarial Networks using different loss functions produced multidimensional synthetic data significantly equivalent to the original training data. Conditional Generative Adversarial Networks were not as successful. The methods proposed are likely to reduce the impact of sample size and bias on a number of statistical learning applications. While Generative Adversarial Networks are not the solution to all sample-size related issues, combined with other pre-processing steps these limitations may be overcome. This presents a valuable means of augmenting geometric morphometric datasets for greater predictive visualization.
ARTICLE | doi:10.20944/preprints202308.2048.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: document binarization; deep learning; gated convolution; generative model; latent diffusion models; text stroke
Online: 30 August 2023 (08:23:05 CEST)
Binarization of degraded documents is an important preprocessing task for various document analysis tasks such as OCR and historical document analysis. Existing studies have applied various convolutional neural network (CNN) models and generative models to document binarization, but they do not generalize well to noise that the model has not seen, and they struggle to extract elaborate text strokes. In this paper, to overcome these challenges, we utilize a latent diffusion model (LDM), a model known for high-quality image generation, for the first time in document binarization. By utilizing the iterative diffusion-denoising process in latent space, it generates high-quality, clean binarized images and achieves strong generalization by using both the data distribution and the time step during training. Additionally, we apply a gated U-Net as the backbone network to preserve text strokes using a trainable gating value. Gated convolution can extract elaborate text strokes by combining the gating value with the features, which allows the model to focus on text regions. Furthermore, we maximize the effectiveness of the proposed model by training it with a combination of LDM loss and pixel-level loss, which is suitable for the model structure. Experiments on the H-DIBCO and DIBCO benchmark datasets show that the proposed model outperforms existing methods.
ARTICLE | doi:10.20944/preprints202308.1979.v1
Subject: Business, Economics And Management, Finance Keywords: deep learning; system engineering; stock price forecasting; aggregate dynamic behavior; generative adversarial network
Online: 30 August 2023 (03:03:52 CEST)
Current stock market forecasting methods encompass fundamental, technical, emotional, and bargaining factors. Predominantly, price prediction hinges on order volume and price, although correlating these two within existing models proves challenging. This study employs Cycle Generative Adversarial Network (Cycle GAN) to unravel the intricate price-volume relationship, combining it with Bollinger Bands for trading signal analysis, overcoming hurdles in short-term forecasting prevalent in numerical analysis and AI. Focusing on TSMC (2330.TW) stock price, the research leverages Cycle GAN in deep learning to master the price-volume nexus, juxtaposed with LSTM and RNN. Historical TSMC closing prices and transaction counts are model inputs, scrutinizing their interconnectedness for predictions. This innovative approach aligns stock price, volume, market value, taxes, and prior changes via system engineering. By intertwining Bollinger Bands with stock price forecasts, trading signals are distilled, factoring in extended index %b for a comprehensive market picture.
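As a rough illustration of the %b indicator mentioned in the abstract above, the following sketch computes Bollinger Bands and %b over a rolling window. It is not the authors' implementation: the function name `percent_b`, the default 20-period window, the band width of 2 standard deviations, and the use of the population standard deviation are assumptions based on the common textbook definition, %b = (price - lower) / (upper - lower).

```python
import statistics


def percent_b(prices, window=20, k=2.0):
    """Return %b for each full rolling window of `prices`.

    Hypothetical helper: the middle band is the simple moving average,
    and the upper/lower bands sit k population standard deviations away.
    A flat window has zero band width, so %b is undefined and None is returned.
    """
    values = []
    for end in range(window, len(prices) + 1):
        chunk = prices[end - window:end]
        middle = statistics.fmean(chunk)
        spread = k * statistics.pstdev(chunk)
        upper, lower = middle + spread, middle - spread
        if upper == lower:
            values.append(None)  # flat window: band width is zero
        else:
            values.append((chunk[-1] - lower) / (upper - lower))
    return values
```

Under this definition, a %b of 0.5 means the latest price sits on the moving average, while values above 1 or below 0 mean the price has moved outside the upper or lower band.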
ARTICLE | doi:10.20944/preprints202307.2143.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: digital twin; oscillation source localization; generative adversarial imputation network; super resolution; Cloudpss-Xstudio
Online: 1 August 2023 (03:45:10 CEST)
Aiming at the difficult problem of broadband oscillation localization in power systems, the intelligent localization method of oscillation source based on the digital twin is proposed and the oscillation source localization system based on the digital twin is constructed. Firstly, a digital twin-based oscillation source localization method and system architecture are proposed. Furthermore, an intelligent positioning method of oscillation source based on data-driven and mechanism fusion is proposed. It includes three steps: oscillation signal preprocessing, oscillation modal analysis and oscillation source localization. For the oscillation signal preprocessing, the generative adversarial imputation network is used to repair the missing samples, and the super-resolution technique is used to realize the super-resolution measurement of broadband oscillation. In the oscillation modal analysis, the spectrum of the oscillation signal is extracted using the fast Fourier transform. To accurately locate the oscillation source, the branch potential energy is used as input to the data-driven model, such as LSTM and CNN. Finally, an oscillation source localization system is developed based on the digital twin workshop CloudPSS-XStudio, which can locate the oscillation source quickly and accurately.
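The modal-analysis step described above (extracting the spectrum of an oscillation signal with the fast Fourier transform) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' system: `fft` is a textbook radix-2 Cooley-Tukey implementation that requires a power-of-two number of samples, and `dominant_frequency` is a hypothetical helper that returns the frequency of the strongest non-DC spectral bin.

```python
import cmath


def fft(samples):
    """Radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("number of samples must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the largest non-DC bin up to Nyquist."""
    spectrum = fft(samples)
    n = len(samples)
    best_bin = max(range(1, n // 2 + 1), key=lambda k: abs(spectrum[k]))
    return best_bin * sample_rate / n
```

For example, a 5 Hz sine wave sampled at 64 Hz for one second concentrates its energy in bin 5, so the helper would report 5 Hz as the dominant oscillation frequency.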
ARTICLE | doi:10.20944/preprints202306.1738.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: computer vision; deep learning; signal quality; horticulture; datasets; enhancement; generative AI; super-resolution
Online: 26 June 2023 (05:03:33 CEST)
The use of visual signals in horticulture has attracted significant attention and encompassed a wide range of data types such as 2D images, videos, hyperspectral images, and 3D point clouds. These visual signals have proven to be valuable in developing cutting-edge computer vision systems for various applications in horticulture, enabling plant growth monitoring, pest and disease detection, quality and yield estimation, and automated harvesting. However, unlike other sectors, developing deep learning computer vision systems for horticulture encounters unique challenges due to the limited availability of high-quality training and evaluation datasets necessary for deep learning models. This paper investigates the current status of vision systems and available data in order to identify the high-quality data requirements specific to horticultural applications. We analyse the impact of the quality of visual signals on the information content and features that can be extracted from these signals. To address the identified data quality requirements, we explore the usage of a deep learning-based super-resolution model for generative quality enhancement of visual signals. Furthermore, we discuss how these can be applied to meet the growing requirements around data quality for learning-based vision systems. We also present a detailed analysis of the competitive quality generated by the proposed solution compared to cost-intensive hardware-based alternatives. This work aims to guide the development of efficient computer vision models in horticulture by overcoming existing data challenges and paving a pathway forward for contemporary data acquisition.
ARTICLE | doi:10.20944/preprints202301.0031.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Spatially Adaptive De-normalization (SPADE); Super-Resolution; Convolutional Neural Network; Generative Adversarial Network
Online: 9 January 2023 (02:15:56 CET)
With the development of deep learning technology, various structures and research methods for super-resolution restoration of natural images and document images have been introduced. In particular, a number of recent studies on image restoration have used generative adversarial networks. Super-resolution restoration is an ill-posed problem because of complex constraints, such as many high-resolution images being possible restorations of the same low-resolution image, and the difficulty of restoring noise such as edges, light smudging, and blurring. In this study, we utilized the spatially adaptive de-normalization (SPADE) structure for document image restoration to solve previous problems such as unclear edges, difficulty in capturing text features, and image color transition. Consequently, it can be confirmed that character edges and ambiguous strokes are restored more clearly than with previously suggested methods. Also, the proposed method’s PSNR and SSIM scores are 8% and 15% higher, respectively, than those of the previous methods.
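For readers unfamiliar with the PSNR score reported above, the sketch below shows the standard definition, PSNR = 10 log10(MAX^2 / MSE), where MSE is the mean squared error between a reference image and a restored image. The function name `psnr`, the flat pixel lists, and the 8-bit peak value of 255 are assumptions for illustration; the abstract does not state how the authors computed their scores.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Hypothetical helper: images are given as flat sequences of intensities.
    Identical inputs have zero error, so PSNR is reported as infinity.
    """
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a restoration closer to the reference, which is why a percentage gain in PSNR is read as an improvement.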
REVIEW | doi:10.20944/preprints202201.0224.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; machine learning; histopathology; computational pathology; convolutional neural networks; generative adversarial networks
Online: 17 January 2022 (12:31:25 CET)
Deep learning techniques, such as convolutional neural networks (CNN), generative adversarial networks (GAN), and graph neural networks (GNN), have over the past decade changed the accuracy of prediction in many diverse fields. In recent years, the application of deep learning techniques in computer vision tasks in pathology demonstrated extraordinary potential in assisting clinicians, automating diagnosis, and reducing costs for patients. Formerly unknown pathological evidence, such as morphological features related to specific biomarkers, copy number variations, and other molecular features, were also able to be captured by deep learning models. In this paper, we review popular deep learning methods and some recent publications about their applications in pathology.
ARTICLE | doi:10.20944/preprints202110.0355.v1
Subject: Computer Science And Mathematics, Mathematical And Computational Biology Keywords: Deep Learning; Electrospray; Mass Spectrometry; Metabolomics; Artificial Intelligence; Generative Methods; Chemical Space; Transformers
Online: 25 October 2021 (13:20:47 CEST)
The ‘inverse problem’ of mass spectrometric molecular identification (‘given a mass spectrum, calculate/predict the 2D structure of the molecule whence it came’) is largely unsolved, and is especially acute in metabolomics where many small molecules remain unidentified. This is largely because the number of experimentally available electrospray mass spectra of small molecules is quite limited. However, the forward problem (‘calculate a small molecule’s likely fragmentation and hence at least some of its mass spectrum from its structure alone’) is much more tractable, because the strengths of different chemical bonds are roughly known. This kind of molecular identification problem may be cast as a language translation problem in which the source language is a list of high-resolution mass spectral peaks and the ‘translation’ a representation (for instance in SMILES) of the molecule. It is thus suitable for attack using the deep neural networks known as transformers. We here present MassGenie, a method that uses a transformer-based deep neural network, trained on ~6 million chemical structures with augmented SMILES encoding and their paired molecular fragments as generated in silico, explicitly including the protonated molecular ion. This architecture (containing some 400 million elements) is used to predict the structure of a molecule from the various fragments that may be expected to be observed when some of its bonds are broken. Despite being given essentially no detailed nor explicit rules about molecular fragmentation methods, isotope patterns, rearrangements, neutral losses, and the like, MassGenie learns the effective properties of the mass spectral fragment and valency space, and can generate candidate molecular structures that are very close or identical to those of the ‘true’ molecules. We also use VAE-Sim, a previously published variational autoencoder, to generate candidate molecules that are ‘similar’ to the top hit. 
In addition to using the ‘top hits’ directly, we can produce a rank order of these by ‘round-tripping’ candidate molecules and comparing them with the true molecules, where known. As a proof of principle, we confine ourselves to positive electrospray mass spectra from molecules with a molecular mass of 500Da or lower, including those in the last CASMI challenge (for which the results are known), getting 49/93 (53%) precisely correct. The transformer method, applied here for the first time to mass spectral interpretation, works extremely effectively both for mass spectra generated in silico and on experimentally obtained mass spectra from pure compounds. It seems to act as a Las Vegas algorithm, in that it either gives the correct answer or simply states that it cannot find one. The ability to create and to ‘learn’ millions of fragmentation patterns in silico, and therefrom generate candidate structures (that do not have to be in existing libraries) directly, thus opens up entirely the field of de novo small molecule structure prediction from experimental mass spectra.
ARTICLE | doi:10.20944/preprints202104.0690.v1
Subject: Arts And Humanities, Architecture Keywords: Artificial Intelligence; Generative Adversarial Network; Machine Learning; Computational Design; Urban infill; Facade design
Online: 26 April 2021 (20:20:29 CEST)
Artificial Intelligence, and especially machine learning, has seen rapid advancement in image processing operations. However, its involvement in architectural design is still in its initial stages compared to other disciplines. Therefore, this paper addresses the issues of developing an integrated bottom-up digital design approach and details a research framework for the incorporation of Deep Convolutional Generative Adversarial Network (GAN) for early-stage design exploration and generation of intricate and complex alternative facade designs for urban infill. This paper proposes a novel building facade design that merges the architectural style, size, scale, and openings of two neighboring buildings as a reference to create a new building design in the same neighborhood for urban infill. This newly produced building contains the outline, style and shape of the parent buildings. A 2D urban infill building design is generated as a picture where 1) neighboring buildings are imported as a reference using a mobile phone and 2) iFACADE decodes their spatial adjacency. It is suggested that iFACADE will be useful for designers in the early design stage to generate new façades based on existing buildings in a short time, saving time and energy. Besides, building owners can use iFACADE to show their architects their preferred architecture facade by mixing two building styles and generating a new building. Therefore, it is suggested that iFACADE can become a communication platform in the early design stages between architects and owners. Initial results properly define a heuristic function for generating abstract design facade elements and sufficiently illustrate the desired functionality of our developed prototype.
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Deep learning; artificial intelligence; generative methods; chemical space; neural networks; transformers; attention; cheminformatics
Online: 3 March 2021 (09:34:54 CET)
The question of molecular similarity is core in cheminformatics, and is usually assessed via a pairwise comparison based on vectors of properties or molecular fingerprints. We recently exploited variational autoencoders to embed 6M molecules in a chemical space, such that their (Euclidean) distance within the latent space so formed could be assessed within the framework of the entire molecular set. However, the standard objective function used did not seek to manipulate the latent space so as to cluster the molecules based on any perceived similarity. Using a set of some 160,000 molecules of biological relevance, we here bring together three modern elements of deep learning to create a novel and disentangled latent space, viz transformers, contrastive learning, and an embedded autoencoder. The effective dimensionality of the latent space was varied such that clear separation of individual types of molecules could be observed within individual dimensions of the latent space. The capacity of the network was such that many dimensions were not populated at all. As before, we assessed the utility of the representation by comparing clozapine with its near neighbours, and did the same for various antibiotics related to flucloxacillin. Transformers, especially when as here coupled with contrastive learning, effectively provide one-shot learning, and lead to a successful and disentangled representation of molecular latent spaces that at once uses the entire training set in their construction while allowing ‘similar’ molecules to cluster together in an effective and interpretable way.
CONCEPT PAPER | doi:10.20944/preprints202309.1513.v1
Subject: Chemistry And Materials Science, Other Keywords: Sustainability; Sustainable Development Goals (SDGs); Chemistry education; Ethical AI; Curriculum; Data science; Generative AI
Online: 22 September 2023 (09:10:52 CEST)
As the world faces unprecedented environmental challenges and the rapid advancements of artificial intelligence (AI), it is crucial for universities to adapt their chemistry education to remain relevant and contribute to sustainability and the United Nations' Sustainable Development Goals (SDGs). This report offers critical insights into university chemistry education, forecasts its development over the next 10-20 years, and offers recommendations for the Ministry of Education and universities to ensure the continued relevance of chemistry programs in the face of AI.
ARTICLE | doi:10.20944/preprints202307.0766.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: ChatGPT; Digital Forensics; Artificial Intelligence; Generative Pre-trained Transformers (GPT); Large Language Models (LLM)
Online: 12 July 2023 (09:09:09 CEST)
The disruptive application of ChatGPT (GPT-3.5, GPT-4) to a variety of domains has become a topic of much discussion in the scientific community and society at large. Large Language Models (LLMs), e.g., BERT, Bard, Generative Pre-trained Transformers (GPTs), LLaMA, etc., have the ability to take instructions, or prompts, from users and generate answers and solutions based on very large volumes of text-based training data. This paper assesses the impact and potential impact of ChatGPT on the field of digital forensics, specifically looking at its latest pre-trained LLM, GPT-4. A series of experiments are conducted to assess its capability across several digital forensic use cases including artefact understanding, evidence searching, code generation, anomaly detection, incident response, and education. Across these topics, its strengths and risks are outlined and a number of general conclusions are drawn. Overall this paper concludes that while there are some potential low-risk applications of ChatGPT within digital forensics, many are either unsuitable at present, since the evidence would need to be uploaded to the service, or they require sufficient knowledge of the topic being asked of the tool to identify incorrect assumptions, inaccuracies, and mistakes. However, to an appropriately knowledgeable user, it could act as a useful supporting tool in some circumstances.
ARTICLE | doi:10.20944/preprints202305.1202.v1
Subject: Engineering, Aerospace Engineering Keywords: Generative model; Knowledge-Based Engineering; Design automation; Conceptual design; Aerospace Engineering; Computer Aided Design
Online: 17 May 2023 (07:14:32 CEST)
This thesis presents the results of work done on a software project for generative models and spreadsheets, allowing for quick creation of the conceptual model of an aircraft. The subject of the work is a response to the current trends and needs prevailing in the field of computer-aided design (CAD) engineering and aviation. In the initial chapters, theoretical issues related to the work being carried out were introduced and the methodology of creating software for construction and verification of the structure of aircraft, along with the needs of interchange between databases of generative models, was presented. In the next stages, the concepts and selected solutions for the user interface supporting the knowledge base were presented along with a set of procedures for its operation. Furthermore, the method of integrating the database with the methods of determining design features for the developed generative models and with the Siemens NX system is described. Problems encountered in software development, as well as solution examples for model application, are also specified. The results obtained and the models generated on their basis were subjected to a strength analysis using Autodesk Inventor software and analysed in terms of meeting the initial assumptions. In the end, conclusions and observations resulting from the work presented in the project were formulated.
ARTICLE | doi:10.20944/preprints201905.0352.v5
Subject: Physical Sciences, Quantum Science And Technology Keywords: Coherence; Compression; Computation; Consciousness; Entanglement; Free-energy principle; Generative model; Information; Qualia; Quantum; Representation
Online: 14 October 2019 (09:53:26 CEST)
The QBIT theory is an attempt toward solving the problem of consciousness based on empirical evidence provided by various scientific disciplines including quantum mechanics, biology, information theory, and thermodynamics. This theory formulates the problem of consciousness in the following four questions: (1) What is the nature of qualia? (2) How are qualia generated? (3) Why are qualia subjective? (4) Why does a quale have a particular quality or meaning? In sum, the QBIT theory proposes that (1) when a pack of quantum information is compressed beyond a certain threshold, a quale is generated; (2) a quale is a superdense pack of maximally entangled qubits in a pure state; (3) when information-theoretic certainty of a system about an external stimulus exceeds a particular level, the system becomes conscious of that stimulus; (4) subjectivity of consciousness is due to the fact that maximally entangled pure states are private and unshareable.
ARTICLE | doi:10.20944/preprints201907.0121.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Artificial Neural Networks; Deep Learning; Generative Neural Networks; Incremental Learning; Novelty detection; Catastrophic Interference
Online: 8 July 2019 (14:29:28 CEST)
Deep learning models are part of the family of artificial neural networks and, as such, they suffer from catastrophic interference when they learn sequentially. In addition, most of these models have a rigid architecture which prevents the incremental learning of new classes. To overcome these drawbacks, in this article we propose the Self-Improving Generative Artificial Neural Network (SIGANN), a type of end-to-end Deep Neural Network system which is able to ease the catastrophic forgetting problem when learning new classes. In this method, we introduce a novelty detection model to automatically detect samples of new classes, and an adversarial autoencoder is used to produce samples of previous classes. This system consists of three main modules: a classifier module implemented using a Deep Convolutional Neural Network; a generator module based on an adversarial autoencoder; and a novelty detection module, implemented using an OpenMax activation function. Using the EMNIST data set, the model was trained incrementally, starting with a small set of classes. The results of the simulation show that SIGANN is able to retain previous knowledge with a gradual forgetfulness for each learning sequence. Moreover, SIGANN can detect new classes that are hidden in the data and, therefore, proceed with incremental class learning.
ARTICLE | doi:10.20944/preprints202308.1174.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Generative adversarial networks (GANs); SAR image generation; speckle noise; deceptive jamming; synthetic aperture radar (SAR)
Online: 16 August 2023 (10:12:21 CEST)
To realize fast and effective synthetic aperture radar (SAR) deception jamming, a high-quality SAR deception jamming template library can be generated by performing sample augmentation on SAR deception jamming templates. However, current sample augmentation schemes of SAR deception jamming templates face certain problems. First, the authenticity of templates is low due to the lack of speckle noise. Second, the generated templates have low similarity to the target and shadow areas of the input templates. To solve these problems, this study proposes a sample augmentation scheme based on generative adversarial networks, which can generate a high-quality library of SAR deception jamming templates with shadows. The proposed scheme solves the two aforementioned problems from the following aspects. First, the influence of the speckle noise is considered in the network to avoid the problem of reduced authenticity in generated images. Second, a channel attention mechanism module is used to improve the network's learning ability of shadow features, which improves the similarity between the generated template and the shadow area in the input template. Finally, the proposed scheme and the SinGAN scheme are compared regarding the equivalent numbers of looks and the structural similarity between the target and shadow in the sample augmentation results. The comparison results demonstrate that, compared to the templates generated by the SinGAN scheme, those generated by the proposed scheme have targets and shadow features similar to those of the original image, and can incorporate speckle noise characteristics, resulting in higher authenticity, which helps to achieve fast and effective SAR deception jamming.
ARTICLE | doi:10.20944/preprints202306.0078.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Natural language processing; text classification; probabilistic models; machine learning; generative learning; collaborative learning; explainable AI
Online: 5 June 2023 (02:57:36 CEST)
The use of artificial intelligence in natural language processing (NLP) has significantly contributed to the advancement of natural language applications such as sentiment analysis, topic modeling, text classification, chatbots, and spam filtering. With a large amount of text generated each day from different sources such as webpages, blogs, emails, social media, and articles, one of the most common tasks in natural language processing is the classification of a text corpus. This is important in many institutions for planning, decision-making, and archiving of their projects. Many algorithms exist to automate text classification operations, but the most intriguing are those that also learn these operations automatically. In this study, we present a new model to infer and learn from data using probabilistic logic and apply it to text classification. This model, called GenCo, is a multi-input single-output (MISO) learning model that uses a collaboration of partial classifications to generate the desired output. It provides a heterogeneity measure to explain its classification results and enables the reduction of the curse of dimensionality in text classification. The classification results are compared with those of conventional text classification models, and the comparison shows that our proposed model has a higher classification performance than conventional models.
ARTICLE | doi:10.20944/preprints202305.0161.v1
Subject: Engineering, Architecture, Building And Construction Keywords: machine learning; conditional generative adversarial network (CGAN); historic district; facade design; decoration style; urban renewal
Online: 3 May 2023 (14:50:45 CEST)
In recent years, artificial intelligence technology has widely influenced the field of design, bringing new ideas to efficiently and systematically solve urban renewal design problems. The purpose of this study is to create a stylized generation technology for building facade decoration in historic districts, which will aid in the design and control of district style and form. The goal is to use the technical advantages of conditional generative adversarial network (CGAN) in image generation and style transfer to create a method for independently designing a specific facade decoration style by interpreting image data of historical district facades. The research in this paper is based on the historical district of Putian in Fujian Province, through an experiment of image data acquisition, image processing and screening, model training, image generation, and style matching of the target area. The research found that: (1) CGAN technology can better identify and generate the decorative style of historical districts. It can realize the overall or partial scheme design of the facade; (2) in terms of adaptability, this method can provide a better scheme reference for historical district reconstruction, facade renovation, and renovation design projects. Especially for districts with obvious decorative styles, the visualization effect is better. In addition, it also has certain reference significance for the determination and design of the facade decoration style of a specific historical building; (3) This method can better learn the internal laws of the complex district style and form so as to generate a new design with a clear decoration style attribute. It can be extended to other fields of historical heritage protection to enhance practitioners' stylized control of the heritage environment and improve the efficiency and ability of professional design.
ARTICLE | doi:10.20944/preprints202302.0117.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Synthetic categorical data generation; generative adversarial networks; imbalance learning; CTGAN; interpretable machine learning; cardiovascular disease
Online: 7 February 2023 (03:43:16 CET)
Machine Learning (ML) methods have become important for enhancing the performance of decision-support predictive models. However, class imbalance is one of the main challenges in developing ML models, because it limits the generalization of these models and biases the learning algorithms. In this paper, we consider oversampling methods for generating synthetic categorical clinical data, aiming to improve the predictive performance of ML models and the identification of risk factors for cardiovascular diseases (CVDs). We performed a comparative study of several categorical synthetic data generation methods, including Generative Adversarial Networks (GANs). Then, we assessed the impact of combining oversampling strategies with linear and nonlinear supervised ML methods. Lastly, we conducted a post-hoc model interpretability analysis based on the importance of the risk factors. Experimental results show the potential of GAN-based models for generating high-quality categorical synthetic data, yielding probability mass functions that are very close to those of real data, maintaining relevant insights, and contributing to increased predictive performance. The combination of the GAN-based model and a linear classifier outperforms other oversampling techniques, improving the area under the curve by 2%. These results demonstrate the capability of synthetic data to help both in determining risk factors and in building models for CVD prediction.
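One way to make "probability mass functions close to real data" concrete is to compare the empirical PMF of a categorical column in the real and synthetic samples. The sketch below uses total variation distance; this metric, the function names, and the example column are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter


def pmf(values):
    """Empirical probability mass function of a categorical sample."""
    counts = Counter(values)
    total = len(values)
    return {category: n / total for category, n in counts.items()}


def total_variation(real, synthetic):
    """Total variation distance between two empirical PMFs.

    0 means identical distributions, 1 means disjoint support.
    """
    p, q = pmf(real), pmf(synthetic)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Hypothetical smoking-status column; the values are made up.
real = ["never"] * 6 + ["former"] * 3 + ["current"] * 1
synthetic = ["never"] * 5 + ["former"] * 3 + ["current"] * 2
print(total_variation(real, synthetic))
```

A small distance on every column is a necessary but not sufficient check, since it ignores dependencies between columns.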
REVIEW | doi:10.20944/preprints202108.0060.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: deep learning; artificial neural network; artificial intelligence; discriminative learning; generative learning; hybrid learning; intelligent systems
Online: 2 August 2021 (17:33:48 CEST)
Deep learning (DL), a branch of machine learning (ML) and artificial intelligence (AI), is now considered a core technology of the Fourth Industrial Revolution (4IR or Industry 4.0). Due to its ability to learn from data, DL technology, which originated from artificial neural networks (ANNs), has become a hot topic in computing and is widely applied in areas such as healthcare, visual recognition, cybersecurity, and many more. However, building an appropriate DL model is a challenging task, due to the dynamic nature and variations in real-world problems and data. Moreover, the lack of core understanding turns DL methods into black-box machines that hamper development at the standard level. This article presents a structured and comprehensive view of DL techniques, including a taxonomy that considers various types of real-world tasks, such as supervised or unsupervised learning. In our taxonomy, we take into account deep networks for supervised or discriminative learning, unsupervised or generative learning, as well as hybrid learning and relevant others. We also summarize real-world application areas where deep learning techniques can be used. Finally, we point out ten potential aspects for future-generation DL modeling with research directions. Overall, this article aims to draw a big picture of DL modeling that can be used as a reference guide for both academia and industry professionals.
ARTICLE | doi:10.20944/preprints201712.0128.v1
Subject: Medicine And Pharmacology, Neuroscience And Neurology Keywords: synesthesia; deep learning network; color perception; generative adversarial network; cognitive modeling; character recognition; GPU computing
Online: 19 December 2017 (06:45:50 CET)
Synesthesia is a psychological phenomenon where sensory signals become mixed. Input to one sensory modality produces an experience in a second, unstimulated modality. In “grapheme-color synesthesia”, viewed letters and numbers evoke mental imagery of colors. The study of this condition has implications for increasing our understanding of brain architecture and function, language, memory and semantics, and the nature of consciousness. In this work, we propose a novel application of deep learning to model perception in grapheme-color synesthesia. Achromatic letter images, taken from a database of handwritten characters, are used to induce synesthesia. Results show the model learns to accurately create a colored version of the inducing stimulus, according to a statistical distribution from experiments on a sample population of grapheme-color synesthetes. The spontaneous, creative mental imagery characteristic of the synesthetic perceptual experience is reproduced by the model. A model of synesthesia that generates testable predictions on brain activity and behavior is needed to complement large-scale data collection efforts in neuroscience, especially when articulating simple descriptions of cause (stimulus) and effect (behavior). The research and modeling approach reported here begins to address this need.
ARTICLE | doi:10.20944/preprints202309.0244.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: image inpainting; generative adversarial networks (GANs); lightweight architecture; conditional normalization; dilated convolution; dense block; self-attention
Online: 5 September 2023 (09:11:57 CEST)
Research on image-inpainting tasks has mainly focused on enhancing performance by augmenting various stages and modules. However, this trend does not consider the increase in the number of model parameters and operational memory, which increases the burden on computational resources. To solve this problem, we propose a Parametric Efficient Image InPainting Network (PEIPNet) for efficient and effective image inpainting. Unlike other state-of-the-art methods, the proposed model has a one-stage inpainting framework in which depthwise and pointwise convolutions are adopted to reduce the number of parameters and computational costs. To generate semantically appealing results, we selected three unique components: spatially-adaptive denormalization (SPADE), dense dilated convolution module (DDCM), and efficient self-attention (ESA). The SPADE was adopted to conditionally normalize activations according to the mask to distinguish between damaged and undamaged regions. The DDCM was employed at every scale to overcome the gradient-vanishing obstacle and gradually fill in pixels by capturing global information along the feature maps. The ESA was utilized to obtain clues from unmasked areas by extracting long-range information. In terms of efficiency, our model has the lowest operational memory compared with other state-of-the-art methods. Both qualitative and quantitative experiments demonstrate the generalized inpainting of our method on three public datasets: Paris StreetView, CelebA, and Places2.
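The parameter saving from replacing a standard convolution with a depthwise convolution followed by a pointwise (1x1) convolution can be checked with simple arithmetic. The sketch below counts weights only (no biases); the layer sizes are arbitrary assumptions and do not come from PEIPNet.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise convolution plus a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
print(standard, separable, separable / standard)
```

For this hypothetical layer the separable form needs 8,768 weights instead of 73,728, roughly 12% of the original count.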
ARTICLE | doi:10.20944/preprints202304.0350.v3
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: ChatGPT; Generative AI; Fake Publications; Human-Generated Publications; Supervised Learning; ML Algorithm; Fake Science; NeoNet Algorithm
Online: 18 August 2023 (11:19:23 CEST)
Background: ChatGPT is becoming a new reality. Where do we go from here? Objective: To show how we can distinguish ChatGPT-generated publications from counterparts produced by scientists. Methods: By means of a new algorithm, called xFakeBibs, we show the significant difference between ChatGPT-generated fake publications and real publications. Specifically, we triggered ChatGPT to generate 100 publications that were related to Alzheimer’s disease and comorbidity. Using TF-IDF on the real publications, we constructed a training network of bigrams comprised of 100 publications. Using 10 folds of 100 publications each, we also constructed 10 calibrating networks to derive lower/upper bounds for classifying articles as real or fake. The final step was to test xFakeBibs against each of the ChatGPT-generated articles and predict its class. The algorithm successfully assigned the POSITIVE label for real ones and NEGATIVE for fake ones. Results: When comparing the bigrams of the training set against all 10 calibrating folds, we found that the similarities fluctuated between 19% and 21%. On the other hand, the bigram similarity from ChatGPT was only 8%. Additionally, when testing the bigrams generated from the 10 calibrating folds against those from ChatGPT, we found that all 10 calibrating folds contributed 51%-70% of new bigrams, while ChatGPT contributed only 23%, which is less than 50% of any of the 10 calibrating folds. The final classification using xFakeBibs set lower/upper bounds of 21.96-24.93 on the number of new edges added to the training mode without contributing new nodes. Using this calibration range, the algorithm predicted 98 of the 100 publications as fake, while 2 articles failed the test and were classified as real publications. Conclusions: This work provided clear evidence of how to distinguish, in bulk, ChatGPT-generated fake publications from real publications. We also introduced an algorithmic approach that detected fake articles with a high degree of accuracy. However, it remains challenging to detect all fake records. ChatGPT may seem to be a useful tool, but it certainly presents a threat to our authentic knowledge and real science. This work is indeed a step in the right direction to counter fake science and misinformation.
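The abstract does not spell out how bigram similarity or the share of new bigrams is computed, so the following is only a minimal sketch of the general idea and not the xFakeBibs algorithm. It assumes whitespace tokenization and treats each text as a set of adjacent word pairs; the function names and example sentences are hypothetical.

```python
def bigrams(text):
    """Set of adjacent lowercase word pairs in a text."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


def new_bigram_share(reference, candidate):
    """Fraction of the candidate's bigrams that do not occur in the reference."""
    candidate_bigrams = bigrams(candidate)
    if not candidate_bigrams:
        return 0.0
    unseen = candidate_bigrams - bigrams(reference)
    return len(unseen) / len(candidate_bigrams)


# Made-up sentences standing in for a training corpus and a test document.
reference = "amyloid plaques accumulate in the brain"
candidate = "amyloid plaques accumulate in aging neurons"
print(new_bigram_share(reference, candidate))
```

In a calibration scheme like the one described, such a share would be computed per fold to derive lower and upper bounds, and a document falling outside them would be flagged.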
ARTICLE | doi:10.20944/preprints202312.0379.v1
Subject: Education, Social Sciences Keywords: AI for education; ChatGPT; educational technologies; LLMs; data science education; psychological impact of AI; generative AI; validity
Online: 6 December 2023 (10:34:59 CET)
The integration of AI in education, particularly through Large Language Models (LLMs), is accelerating, reshaping pedagogical approaches and student interactions with new learning tools. This study assesses the impact of generative AI on the educational experiences of data science students at the Center for Informatics, University of Paraíba (CI/UFPB), Brazil. Through the validation of five psychometric scales, we analyze the students' acceptance of LLMs, their associated burnout levels, technology anxiety and the prevalence of metacognitive and dysfunctional learning strategies. Results indicate a significant adoption of AI-driven technologies, with a low incidence of technology anxiety, such as fear of job displacement due to AI. However, a significant correlation between burnout and dysfunctional learning strategies was observed, which could likely be attributed to the rigorous academic environment. Additionally, the employment of metacognitive strategies in conjunction with LLMs reflects an advanced learning approach, yet challenges with functional learning strategies persist. This study contributes to the discourse on AI in education, highlighting the need for educational frameworks that support effective AI adoption while addressing the psychological demands on students.
ARTICLE | doi:10.20944/preprints202306.0098.v1
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Bayesian nonparametric model; heterogeneity; missing at random; log-normal sum approximation; aggregate insurance claims; clustering; generative model
Online: 1 June 2023 (13:47:57 CEST)
In actuarial practice, the modeling of total losses tied to a certain policy is a non-trivial task. Traditional parametric models to predict total losses have limitations due to complex distributional features such as extreme skewness, zero inflation, multi-modality, etc., and the lack of explicit solutions for log-normal convolution. In the recent literature, the application of the Dirichlet process mixture for insurance loss has been proposed to eliminate the risk of model misspecification biases; however, the effect of covariates as well as missing covariates in the modeling framework is rarely studied. In this article, we propose novel connections among covariate-dependent Dirichlet process mixture, log-normal convolution, and missing covariate imputation. Assuming an individual loss is log-normally distributed, we develop a log skew-normal Dirichlet process to approximate the log-normal sum. As a generative approach, our framework models the joint distribution of outcome and covariates, which allows missing covariates to be imputed under the assumption of missingness at random. The performance is assessed by applying our model to several insurance datasets, and the empirical results demonstrate the benefit of our model compared to the existing actuarial models such as the Tweedie-based generalized linear model, generalized additive model, or multivariate adaptive regression spline.
ARTICLE | doi:10.20944/preprints201811.0252.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: object recognition; image data synthesizing; Human-computer interaction; data synthesizing for immersive HCI; generative adversarial nets; BAGAN
Online: 9 November 2018 (16:00:39 CET)
Augmented reality (AR) is crucial for immersive human-computer interaction (HCI) and the vision of artificial intelligence (AI). Labeled data drive object recognition in AR. However, manually annotating data is expensive and labor-intensive, and scanty labeled data limit the application of AR. To address the problem of insufficient training data in AR object recognition, this paper proposes an automated vision data synthesis method called BAGAN, based on 3D modeling and the GAN algorithm. Our approach has been validated to perform better than other methods on an image recognition task using the natural image database ObjectNet3D. This study can shorten the algorithm development time of AR and expand the application scope of AR, which is of great significance to immersive interactive systems.
REVIEW | doi:10.20944/preprints202309.1680.v1
Subject: Engineering, Architecture, Building And Construction Keywords: automated structural design; Building Information Modeling (BIM); design automation; generative design; interoperability; Structural Design Optimization (SDO); systematic framework
Online: 25 September 2023 (11:25:42 CEST)
Structural design optimization (SDO) plays a pivotal role in enhancing various aspects of construction projects, including design quality, cost-efficiency, safety, and structural reliability. Recent endeavors in academia and industry have sought to harness the potential of Building Information Modeling (BIM) and optimization algorithms to optimize SDO and improve design outcomes. This review paper aims to synthesize these efforts, shedding light on how SDO contributes to project coordination. Furthermore, the integration of sustainability considerations and the application of innovative technologies and optimization algorithms in SDO necessitate more interactive early-stage collaboration among project stakeholders. This study offers a comprehensive exploration of contemporary research in integrated SDO employing BIM and optimization algorithms. It commences with an exploratory investigation, employing both qualitative and quantitative analysis techniques following the PRISMA systematic review methodology. Subsequently, an open-ended opinion survey was conducted among construction industry professionals in Europe. This survey yields valuable insights into the coordination challenges and potential solutions arising from technological shifts and interoperability concerns associated with widespread SDO implementation. These preliminary steps of systematic review and industry survey furnish a robust knowledge foundation, enabling the proposal of an intelligent framework for automating early-stage sustainable structural design optimization (ESSDO) within the construction sector. The framework ESSDO addresses the challenges of fragmented collaboration between architects and structural engineers. This proposed framework seamlessly integrates with the BIM platform, i.e., Autodesk Revit for architects. 
It extracts crucial architectural data and transfers it to the structural design and analysis platform, i.e., Autodesk Robot Structural Analysis (RSA), for structural engineers via the visual programming tool Dynamo. Once the optimization occurs, optimal outcomes are visualized within BIM environments. This visualization elevates interactive collaborations between architects and engineers, facilitating automation throughout the workflow and smoother information exchange.
ARTICLE | doi:10.20944/preprints202307.0660.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Chat Generative Pre-Trained Transformer (ChatGPT); Artificial Intelligence; Human-Computer Interaction; Human-AI Interaction Chatbot; App Development; Coding
Online: 11 July 2023 (09:30:02 CEST)
OpenAI has managed to turn 100 million heads in two months towards their new language model (LM) tool, ChatGPT. The third-generation Generative Pretrained Transformer (GPT-3) has the capacity to tackle problems ranging from simple to complex and sophisticated, while providing reasoning behind its generated answers. ChatGPT can be used to increase productivity, improve efficiency, and face challenging problems. Programming mobile applications is a challenging task that requires professional software engineers with developed skills and abilities. The following paper takes a case study approach to assess how novice app developers can use ChatGPT to generate Java scripts that will be used in Android Studio to create a functional application. After many iterations and ongoing conversations with ChatGPT, the case study produced an application that performs the anticipated function. Important insights have been drawn from the case study that could set the ground for any novice user seeking to create applications using Java scripting and Android Studio.
ARTICLE | doi:10.20944/preprints202010.0502.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: Statistical downscaling; Generative Adversarial Network; Combination of Errors; Convolutional Neural Network; multi-scale structural similarity index; Wasserstein GAN
Online: 25 October 2020 (19:33:49 CET)
Despite numerous studies in statistical downscaling methodology, there remains a lack of methods that can downscale from precipitation modeled in global climate models to regional-level, high-resolution gridded precipitation. This paper reports a novel downscaling method using a Generative Adversarial Network (GAN), CliGAN, which can downscale large-scale annual maximum precipitation given by simulations of multiple atmosphere-ocean global climate models (AOGCMs) from the Coupled Model Inter-comparison Project 6 (CMIP6) to regional-level gridded annual maximum precipitation data. This framework utilizes a convolution encoder-dense decoder network to create a generative network and a similar network to create a critic network. The model is trained using an adversarial training approach. The critic uses the Wasserstein distance loss function, and the generator is trained using a combination of adversarial loss (Wasserstein distance), structural loss with the multi-scale structural similarity index (MSSIM), and content loss with the Nash-Sutcliffe Model Efficiency (NS). The MSSIM index allowed us to gain insight into the model’s regional characteristics and shows that relying exclusively on point-based error functions, widely used in statistical downscaling, may not be enough to reliably simulate regional precipitation characteristics. Further use of structural loss functions within CNN-based downscaling methods may lead to higher-quality downscaled climate model products.
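The content-loss term mentioned above is based on the Nash-Sutcliffe model efficiency, a standard hydrological skill score: one minus the ratio of the squared prediction error to the variance of the observations. The sketch below computes the score on plain Python lists; how CliGAN turns it into a loss term and weights it against the other terms is not stated in the abstract, so the function name and example values are assumptions.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches predicting the mean.

    Raises ZeroDivisionError if every observed value is identical.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance


# Made-up annual maximum precipitation values (mm) at three grid cells.
observed = [40.0, 55.0, 70.0]
simulated = [42.0, 54.0, 66.0]
print(nash_sutcliffe(observed, simulated))
```

Because higher values are better, a training objective would typically minimize something like `1 - nash_sutcliffe(...)`.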
ARTICLE | doi:10.20944/preprints202008.0601.v1
Subject: Social Sciences, Language And Linguistics Keywords: deep structure and surface structure; Idealized Cognitive Model (ICM); Transformational Generative Grammar (TGG); counterintuitive compound words; usage frequency
Online: 27 August 2020 (08:42:16 CEST)
This study attempts to classify compound words on the basis of Cognitive Linguistics and compares their usage trends using Computational Linguistics. Using Noam Chomsky’s concept of deep and surface structures of a sentence, Lees treated compound words not as separate units but as a kind of embedded sentence and hinted at the possible presence of deep and surface structures in compound words, which this study tries to investigate. Then, on the basis of the Idealized Cognitive Model (ICM) proposed by Lakoff and Fauconnier, compound words have been classified into transparent, opaque, and counterintuitive compound words. Using the Google Books Corpus, this study also compares their usage trends. This is done using usage frequency, defined in this work, which is analogous to the productivity of affixed words calculated by G.E. Booij. Each class of compound word formed on the basis of the ICM is found to have a different usage frequency. The possible reasons for this are discussed.
ARTICLE | doi:10.20944/preprints202310.1906.v1
Subject: Computer Science And Mathematics, Computational Mathematics Keywords: MATLAB to Python (M-to-PY) converter; skeletonization; skeleton App; ChatGPT; generative AI; LLMs; machine learning; AI pair programming
Online: 30 October 2023 (10:06:02 CET)
The migration from MATLAB to Python (M-to-PY) has gained significant traction in recent computational research. While MATLAB has long served as a linchpin in myriad scientific endeavors, there's an emerging trend to rejuvenate these projects using Python's extensive AI tools and libraries. This study presents a semi-automated process for M-to-PY conversion, using a detailed case study of an image skeletonization project comprising fifteen MATLAB files and a 1404-image dataset. Skeletonization is foundational for ongoing 3D motion detection research using AI transformers, predominantly developed in Python. The utilization of ChatGPT-4, acting as an AI co-programmer, is pivotal in this conversion. By leveraging the public OpenAI API, we developed an M-to-PY converter prototype, evaluated its efficacy using test cases from the Bard bot, and subsequently employed the converted code in an AI application. The dual contributions encompass a well-tested M-to-PY converter and a Skeleton App capable of sketching and skeletonizing any given image, enriching the AI toolset. This study accentuates how AI resources, like ChatGPT-4, can simplify code transitions, opening doors for innovative AI implementations using primarily MATLAB-coded scientific research.
ARTICLE | doi:10.20944/preprints202109.0389.v1
Subject: Engineering, Control And Systems Engineering Keywords: Deep learning; Variational Autoencoders (VAEs); data representation learning; generative models; unsupervised learning; few shot learning; latent space; transfer learning
Online: 22 September 2021 (16:04:22 CEST)
Despite the importance of few-shot learning, the lack of labeled training data in the real world makes it extremely challenging for existing machine learning methods, as such a limited data set does not represent the data variance well. In this research, we suggest employing a generative approach using variational autoencoders (VAEs), which can be used specifically to optimize few-shot learning tasks by generating new samples with more intra-class variations. The purpose of our research is to increase the size of the training data set using various methods to improve the accuracy and robustness of few-shot face recognition. Specifically, we employ the VAE generator to increase the size of the training data set, including the base and the novel sets, while utilizing transfer learning as the backend. Based on extensive experimental research, we analyze various data augmentation methods to observe how each method affects the accuracy of face recognition. We conclude that the face generation method we proposed can effectively improve the recognition accuracy rate to 96.47% using both the base and the novel sets.
ARTICLE | doi:10.20944/preprints202308.1014.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Artificial Intelligence; Generative Pre-Trained Transformers (GPT); Intrinsically Disordered Proteins (IDPs); Large Language Models (LLMs); Pathways Language Model 2 (PaLM 2)
Online: 14 August 2023 (09:44:19 CEST)
(1) Background: Artificial Intelligence (AI) models have shown potential in various educational contexts. However, their utility in explaining complex biological phenomena, such as Intrinsically Disordered Proteins (IDPs), requires further exploration. This study empirically evaluated the performance of various Large Language Models (LLMs) in the educational domain of IDPs. (2) Methods: Four LLMs, GPT-3.5, GPT-4, GPT-4 with Browsing, and Google Bard (PaLM 2), were assessed using a set of IDP-related questions. An expert evaluated their responses across five categories: accuracy, relevance, depth of understanding, clarity, and overall quality. Descriptive statistics, ANOVA, and Tukey's honestly significant difference tests were utilized for analysis. (3) Results: The GPT-4 model consistently outperformed the others across all evaluation categories. Although GPT-4 and GPT-3.5 were not statistically significantly different in performance (p>0.05), GPT-4 was preferred as the best response in 13 out of 15 instances. The AI models with browsing capabilities, GPT-4 with Browsing and Google Bard (PaLM 2), displayed lower performance metrics across the board, with statistically significant differences (p<0.0001). (4) Conclusion: Our findings underscore the potential of AI models, particularly LLMs such as GPT-4, in enhancing scientific education, especially in complex domains such as IDPs. Continued innovation and collaboration among AI developers, educators, and researchers are essential to fully harness the potential of AI for enriching scientific education.
ARTICLE | doi:10.20944/preprints202105.0401.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Deep neural networks; Disentangled representations; Attention mechanisms; Generative models; Density estimation; Out-of-distribution generalization; Numerical cognition; Visual perception; Cognitive modeling
Online: 18 May 2021 (09:50:01 CEST)
One of the most rapidly advancing areas of deep learning research aims at creating models that learn to disentangle the latent factors of variation from a data distribution. However, modeling joint probability mass functions is usually prohibitive, which motivates the use of conditional models that assume some information is given as input. In the domain of numerical cognition, deep learning architectures have successfully demonstrated that approximate numerosity representations can emerge in multi-layer networks that build latent representations of a set of images with a varying number of items. However, existing models have focused on tasks that require conditionally estimating numerosity information from a given image. Here we focus on a set of much more challenging tasks, which require conditionally generating synthetic images containing a given number of items. We show that attention-based architectures operating at the pixel level can learn to produce well-formed images approximately containing a specific number of items, even when the target numerosity was not present in the training distribution.
ARTICLE | doi:10.20944/preprints202310.0114.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Finite element methods (FEM); cardiovascular disease; convolutional neural network (CNN); U-Net; conditional generative adversarial neural network (cGAN); stress-strain field maps
Online: 3 October 2023 (04:57:59 CEST)
Conducting computational stress-strain analysis using finite element methods (FEM) is a common approach when dealing with the complex geometries of atherosclerosis, a complex cardiovascular disease and a leading cause of global mortality. The considerable expense linked to FEM analysis encourages the substitution of FEM with a considerably faster data-driven machine learning (ML) approach. This study investigated the potential of end-to-end deep learning tools as a more effective substitute for FEM in predicting stress-strain fields within 2D cross sections of the arterial wall. We first proposed a U-Net based fully convolutional neural network (CNN) to predict the von Mises stress and strain distribution based on the spatial arrangement of calcification within arterial wall cross-sections. Further, we developed a conditional generative adversarial network (cGAN) to enhance, particularly from the perceptual perspective, the prediction accuracy of stress and strain field maps for arterial walls with various calcification quantities and spatial configurations. On top of U-Net and cGAN, we also proposed their ensemble approaches to further improve the prediction accuracy of field maps. Our dataset, consisting of input and output images, was generated by implementing boundary conditions and extracting stress-strain field maps. The trained U-Net models can accurately predict von Mises stress and strain fields, with structural similarity index scores (SSIM) of 0.854 and 0.830 and mean squared errors of 0.017 and 0.018 for stress and strain, respectively, on a reserved test set. Meanwhile, the cGAN models, combined with ensemble and transfer learning techniques, demonstrate high accuracy in predicting von Mises stress and strain fields, as evidenced by SSIM scores of 0.890 for stress and 0.803 for strain. Additionally, mean squared errors of 0.008 for stress and 0.017 for strain further support the model's performance on a designated test set.
Overall, this study developed a surrogate model for finite element analysis, which can accurately and efficiently predict stress-strain fields of arterial walls regardless of complex geometries and boundary conditions.
ARTICLE | doi:10.20944/preprints202307.0192.v2
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Aviation Safety Reporting System; ASRS; Aviation Safety; Human Factors; Large Language Models; LLM; ChatGPT; Generative Language Models; GPT-3.5; aeroBERT; BERT; InstructGPT; Prompt Engineering
Online: 11 July 2023 (07:13:20 CEST)
This research investigates the potential application of generative language models, especially ChatGPT, in aviation safety analysis as a means to enhance the efficiency of safety analyses and reduce the time it takes to process incident reports. In particular, ChatGPT was leveraged to generate incident synopses from narratives, which were subsequently compared with ground truth synopses from the Aviation Safety Reporting System (ASRS) dataset. The comparison was facilitated by using embeddings from Large Language Models (LLMs), with aeroBERT demonstrating the highest similarity due to its aerospace-specific fine-tuning. A positive correlation was observed between the length of the synopses and their cosine similarity. In a subsequent phase, human factor issues involved in incidents as identified by ChatGPT were compared to human factor issues identified by safety analysts. A concurrence rate of 61% was found, with ChatGPT demonstrating a cautious approach towards attributing human factor issues. Finally, the model was used to attribute incidents to relevant parties. As no dedicated ground truth column existed for this task, a manual evaluation was conducted. ChatGPT attributed the majority of incidents to the Flight Crew, ATC, Ground Personnel, and Maintenance. This study opens new avenues for leveraging AI in aviation safety analysis.
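Comparing a generated synopsis with a ground truth synopsis through embeddings usually comes down to the cosine similarity of two vectors. The sketch below shows that calculation on short made-up vectors; real embeddings from a model such as aeroBERT would have hundreds of dimensions, and the abstract does not describe how they were extracted, so the function name and values are purely illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical 4-dimensional embeddings of two synopses.
generated = [0.2, 0.7, 0.1, 0.4]
ground_truth = [0.3, 0.6, 0.0, 0.5]
print(cosine_similarity(generated, ground_truth))
```

Values near 1 indicate that the two embeddings point in nearly the same direction, and values near 0 indicate unrelated content.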