REVIEW | doi:10.20944/preprints202110.0247.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: natural language; NLP; Korean; dataset
Online: 18 October 2021 (14:33:41 CEST)
English-based datasets are commonly available from Kaggle, GitHub, or recently published papers. Although benchmark tests with English datasets are sufficient to demonstrate the performance of new models and methods, a researcher still needs to train and validate models on Korean-based datasets to produce a technology or product suitable for Korean processing. This paper introduces 15 popular Korean-based NLP datasets with summarized details such as volume, license, repositories, and other research results inspired by the datasets. I also provide detailed instructions with samples or statistics of the datasets. The main characteristics of the datasets are presented in a single table to give researchers a rapid summary.
ARTICLE | doi:10.20944/preprints202208.0325.v1
Subject: Arts & Humanities, Linguistics Keywords: Hallyu; Korean Wave; Korean language; Tagalog language; Second Language Acquisition; Vocabulary Acquisition; Proto-Lexicon
Online: 17 August 2022 (12:42:10 CEST)
Hallyu, or the Korean wave, has swept through many nations, especially the Philippines. This exploratory study offers a look into the effects of Hallyu on a Filipino speaker through ambient exposure, and consequently on the development of a Korean proto-lexicon through indirect vocabulary acquisition. Finally, a focus group discussion and a preliminary assessment tested the effects of Hallyu on casual Filipino speakers. A thorough, statistically comprehensive qualitative study of the acquisition framework is recommended to provide substantial evidence to support the framework.
ARTICLE | doi:10.20944/preprints202201.0018.v1
Subject: Mathematics & Computer Science, General & Theoretical Computer Science Keywords: NMT Evaluation, Meta-Evaluation, SacreBLEU, Korean
Online: 4 January 2022 (20:24:43 CET)
SacreBLEU, by incorporating a text-normalization step in its pipeline, has been well received as an automatic evaluation metric in recent years. With agglutinative languages such as Korean, however, the metric cannot provide a meaningful result without customized pre-tokenization. This paper examines the influence of different pre-tokenization schemes (word, morpheme, character, and subword) on the metric by performing a meta-evaluation with manually constructed into-Korean human evaluation data. Our empirical study demonstrates that the correlation of SacreBLEU with human judgment fluctuates consistently by token type. The reliability of the metric even deteriorates under some tokenization schemes, and MeCab is no exception. Offering guidance on the proper choice of tokenizer for each metric, we stress the significance of the character level and the insignificance of the Jamo level in MT evaluation.
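The token levels compared in the abstract can be made concrete with a small sketch. The Python below is only an illustration built on the standard library: the function names and the sample sentence are hypothetical, the Jamo split relies on Unicode NFD decomposition of Hangul syllables, and morpheme and subword tokenization are left out because they need an external analyzer (such as MeCab) or a trained subword model. It does not reproduce the paper's pipeline.

```python
import unicodedata


def word_tokens(text: str) -> list[str]:
    """Split on whitespace (word level)."""
    return text.split()


def char_tokens(text: str) -> list[str]:
    """One token per precomposed Hangul syllable or other non-space character."""
    composed = unicodedata.normalize("NFC", text)
    return [ch for ch in composed if not ch.isspace()]


def jamo_tokens(text: str) -> list[str]:
    """Decompose each syllable into conjoining Jamo via NFD normalization."""
    decomposed = unicodedata.normalize("NFD", text)
    return [ch for ch in decomposed if not ch.isspace()]


def pretokenize(text: str, level: str) -> str:
    """Return a space-joined token string at the requested level."""
    tokenizers = {"word": word_tokens, "char": char_tokens, "jamo": jamo_tokens}
    return " ".join(tokenizers[level](text))


if __name__ == "__main__":
    sample = "한국어 처리"  # hypothetical example sentence
    for level in ("word", "char", "jamo"):
        tokens = pretokenize(sample, level).split()
        print(level, len(tokens), tokens)
```

A pre-tokenized string like this would typically be scored with the metric's own tokenizer switched off, so that the chosen token level is the only segmentation applied.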
ARTICLE | doi:10.20944/preprints202205.0007.v1
Subject: Behavioral Sciences, Other Keywords: IPA; Covid-19; Health Perceptions; Korean & Japanese Adolescents
Online: 4 May 2022 (12:26:53 CEST)
This study aims to comparatively analyse the importance and performance of the health of Korean and Japanese adolescents during the prolonged coronavirus disease 2019 (COVID-19) pandemic. Data were collected from 1,341 sampled Korean and Japanese adolescents in September 2021 through online and offline surveys. The collected data were analysed with frequency analysis, reliability testing, t-test, and importance-performance analysis (IPA). The following results were obtained. First, adolescents in the two countries perceive various factors about health as important during the COVID-19 pandemic, but their performance is weak compared to their perceived importance. Second, Korean adolescents had greater perceived importance for all factors of health perception compared to their Japanese counterparts. Third, the difference in performance between Korean and Japanese adolescents was especially evident for ‘hygiene management’, and there were significant differences in performance in ‘disease management’ and ‘physical activity’. Fourth, in quadrant 4 of the IPA matrix, there were similarities and differences in a particular factor of health perception between Korean and Japanese adolescents. Based on these results, we proposed measures to emphasise the importance of health and enhance performance among Korean and Japanese adolescents.
ARTICLE | doi:10.20944/preprints202111.0071.v1
Subject: Medicine & Pharmacology, Nutrition Keywords: Macronutrient intake; Metabolic syndrome; the Korean Health Examinee (HEXA) study
Online: 3 November 2021 (09:07:55 CET)
Macronutrient intake is important in the prevention and management of metabolic syndrome (MetS). We characterized the energy and macronutrient intake of Koreans diagnosed with MetS at recruitment into the Health Examinee (HEXA) cohort. We included 130,423 participants aged 40-69 years in the analysis. Odds ratios (OR) and 95% confidence intervals (CI) were estimated to evaluate macronutrient intake. Low energy intake (OR=0.94, 95% CI: 0.89-0.98) and low fat intake (OR=0.91, 95% CI: 0.86-0.97) were observed among 50-59-year-old men. Only postmenopausal women had a lower intake of total energy (OR=0.95, 95% CI: 0.92-0.97), while low fat intake was observed in all women (OR=0.80, 95% CI: 0.77-0.83). For carbohydrate intake, the ORs were 1.14 (95% CI: 1.08-1.22) and 1.17 (95% CI: 1.08-1.27) among women in their 50s and 60s, respectively. Protein intake was low among women in their 50s (OR=0.90, 95% CI: 0.86-0.95) and 60s (OR=0.88, 95% CI: 0.82-0.94). High intake of plant carbohydrates in women (OR=1.16, 95% CI: 1.12-1.20) and of plant protein in both genders (OR=1.09, 95% CI: 1.05-1.13) was observed, along with low intake of total energy, fat, and animal-source carbohydrates in both genders. Fat intake was low regardless of food source. In conclusion, high consumption of plant-source and low consumption of animal-source macronutrients was observed in Korean adults diagnosed with MetS.
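For readers unfamiliar with the OR and 95% CI notation used above, the following Python sketch shows how an unadjusted odds ratio and its Wald confidence interval are obtained from a 2x2 table. This is an assumption-laden illustration: the abstract does not state which model produced its estimates (they may well be covariate-adjusted), the function name is hypothetical, and the counts in the usage line are invented, not taken from the HEXA cohort.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a: cases with the exposure,      b: cases without it,
    c: non-cases with the exposure,  d: non-cases without it.
    z = 1.96 corresponds to a 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Invented counts, for illustration only.
    estimate, low, high = odds_ratio_ci(10, 20, 30, 40)
    print(f"OR={estimate:.2f}, 95% CI: {low:.2f}-{high:.2f}")
```

An interval that excludes 1, as in most of the estimates quoted in the abstract, is what marks an association as statistically significant at the 5% level.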
ARTICLE | doi:10.20944/preprints202104.0731.v1
Subject: Social Sciences, Accounting Keywords: COVID-19; crisis management; Korean fitness center; Importance–Performance Analysis
Online: 28 April 2021 (07:47:57 CEST)
The purpose of this research was to verify the importance and performance of crisis management in Korean fitness centers using Importance-Performance Analysis (IPA). For this study, 304 fitness center executives and managers in the Seoul and Gyeonggi region were selected from March 21 to May 17, 2020. Frequency analysis was performed using SPSS 24.0, and exploratory factor analysis was conducted to verify validity and reliability. Priority analysis and IPA were performed to compare the mean values, and the following results were obtained. First, the first quadrant contained 6 attributes besides keeping social distance between employees and customers. Second, the second quadrant contained four attributes in addition to regular disinfection of the gymnasium. Third, the third quadrant contained 6 attributes besides keeping furniture clean. Fourth, the fourth quadrant contained three attributes in addition to the restriction of face-to-face meetings. The conclusions are as follows. First, fitness centers should equip supplies for the prevention of COVID-19, keep social distance, and check government support policies. Second, they should analyze economic support policies and research how to apply them. Third, they should prepare various non-face-to-face communication methods and untact (non-contact) marketing strategies. Fourth, they should make a checklist for factors of relatively little importance.
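The quadrant assignment at the heart of IPA can be sketched in a few lines of Python. Everything specific here is an assumption: the abstract does not say where the axes were placed or how the quadrants were numbered, so the sketch uses the grand means as crosshairs and one common numbering, and the attribute names and scores in the usage example are invented.

```python
from statistics import mean


def ipa_quadrants(scores: dict[str, tuple[float, float]]) -> dict[str, int]:
    """Assign each attribute a quadrant from its (importance, performance) means.

    Assumed numbering (conventions differ between studies):
      1 = high importance, high performance
      2 = high importance, low performance
      3 = low importance, low performance
      4 = low importance, high performance
    """
    importance_cut = mean(imp for imp, _ in scores.values())
    performance_cut = mean(perf for _, perf in scores.values())
    quadrants = {}
    for name, (imp, perf) in scores.items():
        high_imp = imp >= importance_cut
        high_perf = perf >= performance_cut
        if high_imp and high_perf:
            quadrants[name] = 1
        elif high_imp:
            quadrants[name] = 2
        elif not high_perf:
            quadrants[name] = 3
        else:
            quadrants[name] = 4
    return quadrants


if __name__ == "__main__":
    # Invented survey means on a 5-point scale, for illustration only.
    example = {
        "social distancing": (4.8, 4.5),
        "regular disinfection": (4.6, 3.1),
        "furniture cleaning": (3.2, 3.0),
        "limiting meetings": (3.0, 4.2),
    }
    for attribute, quadrant in ipa_quadrants(example).items():
        print(attribute, "-> quadrant", quadrant)
```

Some studies place the crosshairs at the scale midpoint or at the medians instead of the means, which can move attributes near the boundaries into a different quadrant.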
ARTICLE | doi:10.20944/preprints202101.0060.v1
Subject: Biology, Plant Sciences Keywords: Korean soybean varieties; nsSNP; Biomarker; SIFT; Polyphen; PANTHER; I-mutant 2.0
Online: 4 January 2021 (16:25:22 CET)
Soybean is a highly nutritious legume grown globally as a food and feed crop. An examination of a collection of 10 cultivated and 6 wild Korean soybean varieties showed notable phenotypic variability among the varieties. Therefore, to develop a list of biomarker candidates useful for growing soybeans of better quality and quantity, the genes of 16 Korean soybean varieties were compared with those of the reference Glycine max var. Williams 82. The comparison was made through gene sequencing to facilitate selection of nsSNPs. The objective of the study was to identify the structural and functional variations caused by nsSNPs and discuss whether the collection of Korean soybean varieties qualifies as biomarkers based on their phenotypic traits. The data were analyzed using four software tools: SIFT, Polyphen, PANTHER, and I-mutant 2.0, which are designed to detect the rate of functional and structural variations caused by the nsSNPs in cultivated and wild soybean varieties. Genotypic information obtained in the analysis was used to develop a core collection of biomarkers based on whether nsSNP content was found in more than half of the 16 samples. The list of biomarker candidates developed in this study showed that Korean soybean could provide valuable information needed in both future crop genetic research and identification of biomarkers.
Subject: Medicine & Pharmacology, Oncology & Oncogenics Keywords: traditional korean medicine; hippocampus; neuronal cell death; oxidative stress; medicinal herbs
Online: 10 November 2019 (14:53:14 CET)
Incidence rates of neurodegenerative diseases have steadily increased globally, but no therapeutic approach is available. We formulated a new herbal remedy composed of five medicinal plants, called Chen-Ma-Dan-Sam-Ga-Mi-Bang (CMST), and aimed to establish its pharmacological properties and corresponding actions on hippocampal neuronal cell injury in a hypoxia-induced mouse model. Mice were exposed to normoxia or hypoxia with or without CMST for 5 days. We observed pharmacological effects of CMST on the cell injury indicated by enhanced dihydroethidium and 4-hydroxynonenal signals, which correlated with hypoxia-induced abnormal redox status at the protein or gene expression level (abnormal elevations of nitric oxide, reactive oxygen species, and lipid peroxidation, and deteriorations of total glutathione, total antioxidant capacity, and activities of superoxide dismutase and catalase). CMST also notably attenuated neuronal cell injury markers such as p-tau and cleaved caspase-3 associated with DNA oxidation (53BP1 and phospho-histone H2AX), inflammatory cytokines, and heme oxygenase-1. We further examined the underlying actions of CMST in an in vitro experiment, which pointed to inactivation of microglial cells, which can mediate neuronal cell injury. Collectively, CMST protected hippocampal neuronal cells against hypoxia-induced injury via inactivation of microglial cells and normalization of redox status.
ARTICLE | doi:10.20944/preprints201810.0518.v1
Subject: Earth Sciences, Environmental Sciences Keywords: Korean fir; hierarchical regression model; climate change; seedling survival; dwarf bamboo; drought
Online: 23 October 2018 (05:00:38 CEST)
Regional declines of the Korean fir (Abies koreana) have been observed since the 1980s in the subalpine region. To explain this decline, it is fundamental to investigate the degree to which environmental factors have contributed to plant distributions on diverse spatial scales. We applied a hierarchical regression model to determine quantitatively the relationship between the abundance of Korean fir (seedlings) and diverse environmental factors across two different ecological scales. We measured Korean fir density and the occurrence of its seedlings in 102 (84) plots nested at five sites and collected a range of environmental factors at the same plots. Our model included hierarchical explanatory variables at both the site level (weather conditions) and the plot level (micro-topographic factors, soil properties, and competing species). The occurrence of Korean fir seedlings was positively associated with moss cover and rock cover but negatively related to dwarf bamboo cover. At the site level, winter precipitation was significantly positively related to the occurrence of seedlings. A hierarchical Poisson regression model revealed that Korean fir density was negatively associated with slope aspect, topographic position index, Quercus mongolica cover, and mean summer temperature. Our results suggest that drought and competition with other species are factors that limit the survival of Korean fir. We predict that the population of Korean fir will continue to decline on the Korean Peninsula due to rising temperatures and seasonal drought, and that only a few Korean fir will survive on northern slopes or in valleys where competition with dwarf bamboo and Q. mongolica can be avoided.
ARTICLE | doi:10.20944/preprints202104.0217.v1
Subject: Life Sciences, Biochemistry Keywords: full-length coding region sequence; HIV-1; Korean subclade B; sequence length; hemophilia; evolution
Online: 7 April 2021 (17:26:03 CEST)
We aimed to investigate whether the sequence length of HIV-1 increases over time. A longitudinal analysis of full-length coding region sequences (FLs) during an HIV-1 outbreak among patients with hemophilia and local controls infected with the Korean subclade B of HIV-1 (KSB) was performed. Genes were amplified by overlapping RT-PCR or nested PCR and subjected to direct sequencing. Overall, 141 FLs were sequentially determined over 30 years in 62 KSB-infected patients. Phylogenetic analysis indicated that within KSB, two FLs from plasma donors O and P comprised two clusters together with 8 and 12 patients with hemophilia, respectively. Signature pattern analysis for the KSB of HIV-1 revealed 91 signature nucleotide residues (1.05%). In total, 48 and 43 signature nucleotides originated from clusters O and P, respectively. Only six positions contained 100% specific nucleotide(s) in clusters O and P. Additionally, in-depth FL analysis over 30 years indicates that the KSB FL significantly increased over time before combined antiretroviral therapy (cART) and decreased with cART. The increase occurred due to a significant increase in env and nef genes, originating in the variable regions of both genes. The increase in the sequence length of HIV-1 over time suggests that it has an evolutionary direction.
ARTICLE | doi:10.20944/preprints202102.0575.v1
Subject: Life Sciences, Biochemistry Keywords: full-length coding region sequence; HIV-1; Korean subclade B; sequence length; hemophilia; evolution
Online: 25 February 2021 (10:31:36 CET)
The objective of this study is to investigate whether the sequence length of HIV-1 increases over time. We performed a longitudinal analysis of full-length coding region sequences (FLs) in an outbreak of HIV-1 infection among patients with hemophilia and in local controls identified as infected with the Korean subclade B of HIV-1 (KSB). Genes amplified by overlapping RT-PCR or nested PCR were subjected to direct sequencing. In total, 141 FLs were sequentially determined over 30 years in 62 KSB-infected patients. Non-KSB sequences were retrieved from the Los Alamos National Laboratory HIV Database. Phylogenetic analysis indicated that within KSB, 2 FLs from plasma donors O and P comprised two clusters together with 8 and 12 patients with hemophilia, respectively. Signature pattern analysis for the KSB of HIV-1 revealed signature nucleotide residues at 1.05%, compared with local controls. Additionally, in-depth FL sequence analysis over 30 years in KSB indicates that the KSB FL significantly increases over time before combined antiretroviral therapy (cART) and decreases on cART. Furthermore, a significant increase in FL length over time occurred in subtypes B, C, and G, but there was no increase in subtypes D, A, and F1. Consequently, subtypes F1 and D had the shortest sequence length. Our analysis was extended to compare HIV-1 with HIV-2 and SIVs. Essentially, the longer the sequence length (SIVsm > HIV-2 > SIVcpz > HIV-1), the longer the survival period. The increase in the length of the HIV-1 sequence over time suggests that it might be an evolutionary direction toward attenuated pathogenicity.
ARTICLE | doi:10.20944/preprints202111.0254.v1
Subject: Engineering, Construction Keywords: Korean Heritage; Asian Architecture; Tadao Ando; Hypogeal Chambers; Architectural Proportions; Ashlar Construction; Innovative Architectural Projects.
Online: 15 November 2021 (11:23:12 CET)
The purpose of this article is to disclose the architectural proportions and nature of the Korean national treasure in Seokguram Grotto, Gyeongju. The authors compare its features with those of other ancient hypogeal or ashlar constructions and intend to rediscover its relevant hidden configuration and latent structural properties to show its uniqueness. The methods employed in the research belong initially to architectural design and composition and advance, at a later stage, into the nuances of stone masonry, lighting effects, and cohesive construction. In this discussion and thorough analysis, different philosophical and scientific subtleties come to the surface. The results demonstrate a significant potential that can be applied in part to recent architectural developments like Tadao Ando’s Buddha Hill in Hokkaido (2017) and the authors’ own project for a Buddhist monument.
ARTICLE | doi:10.20944/preprints201803.0103.v1
Subject: Engineering, Energy & Fuel Technology Keywords: high-performance buildings; energy-saving technology; primary energy consumption; CO2 emission; Korean climate; EnergyPlus; reference building
Online: 14 March 2018 (10:05:02 CET)
This study aims to suggest a basis for selecting technologies for developing high-performance buildings that reduce energy consumption and greenhouse gas emissions. Energy-saving technologies comprising 15 cases were categorized into passive, active, and renewable energy systems. EnergyPlus v8.8 was used to analyze the contribution of each technology to reducing primary energy consumption and CO2 emissions in the Korean climate. The primary energy consumption of the base model was 464.1 and 485.1 kWh/m²a in Incheon and Jeju, respectively, and the CO2 emissions were 83.4 and 87.4 kgCO2/m²a, respectively. Each technology (cases 1–15) provided a different energy-saving contribution in the Korean climate depending on its characteristics. The heating, cooling, and other energy-saving contributions of each technology indicate that their saving rates can be used when selecting suitable technologies for the cooling and heating seasons. Case 15 (active chilled beam with dedicated outdoor air system + ground source heat pump) showed the highest energy-saving rate. In case 15, the primary energy consumption of the Incheon and Jeju models fell to 189.4 (a 59.2% reduction) and 206.2 kWh/m²a (57.4%) compared to the base case, respectively, and the CO2 emissions fell to 32.7 (60.8%) and 35.6 kgCO2/m²a (59.3%), respectively.
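The case 15 percentages can be checked against the base-model figures with simple arithmetic. The Python sketch below assumes that the case 15 numbers (189.4, 206.2, 32.7, 35.6) are the resulting values rather than the amounts removed, which is the reading under which the quoted percentages reproduce; the function name and dictionary labels are illustrative only.

```python
def percent_reduction(base: float, improved: float) -> float:
    """Relative reduction from base to improved, in percent."""
    return (base - improved) / base * 100.0


# (base value, case 15 value, percentage quoted in the abstract)
REPORTED = {
    "Incheon primary energy [kWh/m2a]": (464.1, 189.4, 59.2),
    "Jeju primary energy [kWh/m2a]": (485.1, 206.2, 57.4),
    "Incheon CO2 [kgCO2/m2a]": (83.4, 32.7, 60.8),
    "Jeju CO2 [kgCO2/m2a]": (87.4, 35.6, 59.3),
}

if __name__ == "__main__":
    for label, (base, case15, quoted) in REPORTED.items():
        computed = percent_reduction(base, case15)
        print(f"{label}: computed {computed:.1f}%, quoted {quoted}%")
```

The computed values agree with the quoted ones to within about 0.1 percentage point, a gap that is consistent with the abstract's inputs having been rounded to one decimal place.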
BRIEF REPORT | doi:10.20944/preprints202002.0424.v1
Subject: Keywords: SARS-CoV-2; Korean COVID-19; Wuhan Corona virus; real time PCR Ct Value; Sensitivity; False Negative
Online: 28 February 2020 (12:03:13 CET)
Since mid-December of 2019, coronavirus disease 2019 (COVID-19) has been spreading from Wuhan, China. As of February 21, a total of 75,773 cases had been confirmed worldwide, and the disease had spread to more than two dozen countries. Transmission of COVID-19 can occur early in the course of infection since SARS-CoV-2 viral loads in asymptomatic patients are similar to those in symptomatic patients. Therefore, more sensitive diagnostic methods are needed to detect the early phase of infection and prevent secondary or tertiary spread. Here, we compare the RT-PCR confirmatory test results using two different SARS-CoV-2 viral RNAs from two Korean COVID-19 confirmed cases. The RT-PCR method targeting the RdRP gene, which is recommended by the WHO guideline, was less sensitive than the method targeting the N genes (as per the CDC guideline). Because many countries follow the WHO guideline, our findings may contribute to the early diagnosis of COVID-19.