Social Sciences

Article

Social Sciences

Language and Linguistics

Language- and Activity-Specific Associations Between English and Chinese Home Literacy Activities and Receptive Vocabulary Among Chinese–Canadian Children: A Repeated Cross-Sectional Study

Guofang Li

Fubiao Zhen

Abstract: This repeated cross-sectional study examined how English and Chinese home literacy activities were associated with receptive vocabulary among Chinese–English bilingual children in British Columbia, Canada. The study included 123 children in Grade 1 and 126 children in Grade 2. Children’s English and Chinese receptive vocabulary was assessed using standardized vocabulary measures, and parents reported the frequency of five home literacy activities in each language: book or magazine reading, computer or laptop use, television or movie watching, storytelling, and singing songs. Descriptive analyses showed that English home literacy activities were generally more frequent than Chinese activities across both grades. However, correlation and hierarchical regression results showed that Chinese receptive vocabulary was more strongly and more broadly associated with Chinese home literacy activities. In Grade 1, Chinese book or magazine reading significantly predicted Chinese receptive vocabulary, while English singing songs significantly predicted English receptive vocabulary. In Grade 2, Chinese book or magazine reading, computer or laptop use, and storytelling significantly predicted Chinese receptive vocabulary, whereas English book or magazine reading was the only significant home literacy variable for English receptive vocabulary. These findings highlight the language- and activity-specific nature of bilingual home literacy environments.

Posted: 15 July 2026

https://doi.org/10.20944/preprints202607.1039.v1

Article

Social Sciences

Language and Linguistics

Reconceptualising Reading Habits Among Pre-Service Teachers: A Structural Model of Organisation, Motivation, and Reading Intensity

Hui Geng

Dedi Irwan

Jiyang Hu

Ooi Kok Loang

Abstract: Background: The habit of reading is recognised as an essential element of academic learning and professional growth in higher education, especially in teacher education programs. Teachers are expected to engage critically with academic texts and cultivate enduring reading skills that will further influence their instructional methodologies and literacy modelling in educational settings. Empirical research is increasingly showing that digital-oriented, fragmented, and requirements-driven practices characterise student reading engagement. Purpose: This study seeks to investigate the framework of teachers’ reading habits by defining the underlying characteristics of reading behaviour and studying the structural interactions among these dimensions. The study aims to ascertain the interplay of organised reading practices, motivation, reading intensity, and reading choices within a cohesive explanatory framework. Materials and Methods: A quantitative cross-sectional survey was used, including 1,595 pre-service teachers enrolled in teacher education programs in one Indonesian province. Data were collected using a standard questionnaire graded on a four-point Likert scale. The analysis was carried out in several phases: descriptive statistics, exploratory factor analysis, confirmatory factor analysis, and path analysis. Results: The results showed that teachers were mostly engaged in reading digital media and episodic academic texts. Factor analysis validated a four-dimensional framework consisting of organised reading and resource management, dual motivation, reading intensity and focus, and casual reading preferences. Structural analyses show that structured reading practices and motivations indirectly affect reading engagement, whereas reading intensity and focus serve as the most significant direct predictors of long-lasting reading preferences. Conclusion: The findings show that teachers’ reading habits are best understood as a behavioural system mediated by structural factors. Consistent reading engagement relies more on the frequency and concentration of reading activities than on motivation alone.

Posted: 08 July 2026

https://doi.org/10.20944/preprints202607.0579.v1

Article

Social Sciences

Language and Linguistics

Visual Semantics in MT Evaluation: Do Image Descriptions Help with Assessment of Multimodal MT Quality?

Sami Ul Haq

Sheila Castilho

Yvette Graham

Abstract: Multimodal machine translation aims to integrate visual context, such as images, with textual data to improve the translation of ambiguous source text. However, the evaluation of these systems still largely relies on traditional automatic metrics, which do not account for additional modalities during evaluation. The lack of dedicated evaluation methods often results in inconsistent findings and creates uncertainty regarding the actual contribution of visual context in translation. In this work, we examine the performance of state-of-the-art trained and untrained automatic evaluation metrics, particularly when comparing multimodal and text-only systems. Our evaluation focuses on whether existing metrics are sensitive enough to distinguish between multimodal and text-only machine translation systems. We further investigate whether automatically generated image descriptions can serve as effective contextual signals for improving metric sensitivity to multimodal tasks. Our results show that incorporating such visual information into supervised metrics yields better alignment with human judgments. While all metrics successfully distinguished image-aware from image-agnostic systems on general test sets, both n-gram–based and embedding-based metrics struggled on a contrastive evaluation set designed to capture context-dependent errors. Furthermore, we discuss how the presence of visual context influences human evaluators' judgments, as ratings were often substantially revised, further emphasising the critical role of context in the evaluation of multimodal machine translation systems.

Posted: 22 June 2026

https://doi.org/10.20944/preprints202606.1564.v1

Article

Social Sciences

Language and Linguistics

Bridging the Language Gap: Exploring the Divide Between Scientific Discourse and Everyday Language through Word-Cards

Anna Castaldo

Abstract: This study investigates the semantic friction between colloquial language and chemical discourse, exploring how technical terminology functions as a tool for “narrative technocracy.” Focusing on terms such as “law,” “energy,” and “resonance,” the paper demonstrates how ostensibly familiar words, when adopted by science, acquire precise, often exclusionary meanings. This process can lead to either a false sense of understanding or an imperative of blind trust for non-expert audiences, thereby stabilizing expert authority. Using a qualitative analysis of a series of “Word-Cards” designed for LinkedIn, the study examines how visual mediation can navigate these interpretative contingencies. The findings suggest that deliberate linguistic reflection in science communication can mitigate semantic misalignments, transforming technical discourse from a barrier of authority into a platform for informed public engagement.

Posted: 29 May 2026

https://doi.org/10.20944/preprints202605.2044.v1

Article

Social Sciences

Language and Linguistics

The Generalized Coordinate System for Rhetorical Modes

Zi-Niu Wu

Abstract: This paper introduces the Generalized Coordinate System (GCS) as a framework for analyzing and generating rhetorical modes, the conventional patterns of discourse. The GCS is composed of low-dimensional, mediating, and high-dimensional axes. The low-dimensional axes are the Thing, Feature, Quantitative Attribute, Qualitative Attribute, and Formal Attribute axes, which form the objects or foundational elements for rhetorical modes. The mediating axes are the Basic Expressive-Representational Elements and Rhetorical Mode axes, which transform the raw material into communicable language. The high-dimensional axes include the Cognitive Function axis, the Epistemic Purpose axis, and the Five-Level Expression Staircase axis (Depth axis). The high-dimensional axes determine the cognitive depth and ultimate purpose, and capture the developmental progression of language competence, from raw perception to paradigm-shattering insight. Three types of semantic or modal mapping are defined: low-dimensional mapping (from the low-dimensional axes to the mediating axes), high-dimensional mapping (from the mediating axes to the high-dimensional axes), and full-dimensional mapping. These mappings form a pyramidal hierarchy, progressing from foundational elements (things, features, and attributes) to higher-order cognitive functions and epistemic purposes. By employing three core logical structures (combinatory, parallel, and embedded), the GCS consolidates infinite expressive possibilities within the finite intersections of its axes. The system's generative capacity, quantifiable by the number of axis intersections (generalized mode number), enables the navigation of nearly infinite expressive variations while steering practical applications toward finite, purpose-driven goals. The GCS transitions rhetorical modes from a static taxonomy to a dynamic analytical system for discourse construction and analysis, possibly offering insights for the development of large language models through the integration of a programmable rhetorical mode system.

Posted: 07 May 2026

https://doi.org/10.20944/preprints202602.0616.v2

Article

Social Sciences

Language and Linguistics

Mapping the Semantic Networks of Political Communication: Diachronic Transitions from Structurally Coherent to Semantically Fragmented Discourse in the Digital Era

Sophia Melanson Ricciardone

Abstract: This study examines how the semantic features of political discourse have changed since the pre-digital period and how contemporary political news media environments have become structurally organized according to fragmented meaning ecologies. Rather than examining lexical content alone, this study uses a comparative diachronic semantic network analysis using corpus-based computational discourse methods. Background: While polarization is often studied in terms of ideological distance or word frequency shifts, less is known about how the relationships between semantic domains themselves may reorganize over time, potentially affecting social cohesion and institutional trust. Methods: A comparative diachronic design was applied to political news transcripts from the 1980s and the 2020s (both sourced via YouTube). Semantic annotation (WMatrix7), n-gram analysis, and Pearson correlation-based semantic network modeling were used to compare semantic coupling across governance, emotional, psychological, and social domains, alongside distributional statistics and functional discourse coding. Results: Independent t-tests found no significant differences in overall semantic frequency distributions between corpora, indicating distributional stability. However, network analyses revealed a strong contrast in structure: the 1980 corpus exhibited uniformly strong positive correlations across semantic domains, reflecting a highly integrated system, whereas the 2026 corpus showed weaker, more variable, and in some cases negative correlations, strongly suggesting semantic decoupling and fragmentation. These findings were supported by n-gram and functional analyses showing increased conversationalization, negation, and disfluency in the 2026 discourse. Conclusions: Political discourse appears to have undergone a structural reorganization from a coherent, highly coupled semantic system toward a more modular and fragmented configuration, suggesting that contemporary polarization is better understood as a shift in semantic organization than as changes in lexical frequency alone.

Posted: 30 April 2026

https://doi.org/10.20944/preprints202604.2083.v1

Review

Social Sciences

Language and Linguistics

Uncovering How Social Cognitive Representations of Bilingualism in the United States Can Result in Psychological Shame and Linguistic Homelessness for Transnational Youth: Reorienting Bilingualism-as-Problem to a Resource and a Right

Steve Daniel Przymus

Omar Serna-Gutiérrez

Pablo Montes

Abstract: Language is social, as it is used by individuals to communicate and exchange ideas in society. Language is also cognitive, as the primary function of language, even before communicating and exchanging ideas, is to think. This article connects the social representations of what bilingualism is in the United States and how transnational youth are talked about in U.S. society with how both of these social representations create cognitive representations (e.g., thoughts, ideas, and beliefs) about transnational youth that result in negative educational policies and practices, and shameful psychological and behavioral experiences for these youth. We begin with an ethnosemantic analysis of the word “bilingual” in the U.S. and then use the cognitive linguistic phenomena of conceptual metaphor and conceptual metonymy to explain how bilingualism is cognitively viewed as a “shameful problem” in society for transnational youth. We link linguistic shame, brought on by the social cognitive representations of bilingualism as transnational youth metonymically being incomplete, broken, in disrepair, fractured, unsettled, displaced, lacking fully built linguistic structures, not fully in possession of any language, to the phenomenon of and conceptual metaphor of TRANSNATIONAL YOUTH’S BILINGUALISM IS LINGUISTIC HOMELESSNESS (Bakhtin, 1981; Baratta, 2014; Britton, 1996). We conclude by putting forth a new metaphor, TRANSNATIONAL YOUTH FUNDS OF KNOWLEDGE ARE MYCELIAL NETWORKS, that rejects the concept of linguistic homelessness by pointing to these youth’s expanding networks of fluid languaging practices, transnational academic skills, and ever adapting identities. Through this new discourse we advocate for new ways of socially talking about transnational youth and their languaging practices that may lead to different cognitive representations of these students; reorienting bilingualism from a problem to a resource and a right.

Posted: 28 April 2026

https://doi.org/10.20944/preprints202604.1878.v1

Article

Social Sciences

Language and Linguistics

A Billion Ways to Ask a Question: A GCS-Based 10-Dimensional Framework for Inquiry Generation

Zi-Niu Wu

Abstract: Asking questions is fundamental, but without a systematic framework, it remains a matter of intuition rather than design. The Generalized Coordinate System (GCS) was initially proposed for analyzing and generating rhetorical modes. In this paper, we apply the GCS to form an inquiry design framework—the GCS-based 10-dimensional inquiry generation framework: treating a question as a coordinate point across ten axes, so that we have potentially a billion ways to ask questions. The five low-dimensional axes (Thing, Feature, Quantitative Attribute, Qualitative Attribute, Formal Attribute) determine what and how the question expresses; the two mediating axes (Basic Element, Rhetorical Mode) transform a raw inquiry into a communicable question package; the three high-dimensional axes (Cognitive Function, Epistemic Purpose, Expression Staircase) determine what mental operation, why, and at what developmental level. This GCS-based 10-dimensional inquiry generation transforms questioning from an intuitive art into a designable, transferable, and evaluable cognitive methodology, and is potentially useful in applications such as education, research, communication, and language modeling.

Posted: 13 April 2026

https://doi.org/10.20944/preprints202604.0836.v1

Article

Social Sciences

Language and Linguistics

Orthographic Depth and Spelling Development in Immersion Education: A Predictive Framework of Spelling Errors in French

Annick Comblain

Abstract: Orthographic depth varies across alphabetic writing systems and plays a central role in spelling acquisition. In immersion education, a second language (L2) is used as a language of instruction for part of the curriculum, such that learners are primarily confronted with its writing system during the initial stages of literacy development. This early exposure may shape the spelling strategies subsequently deployed in the first language (L1), which also corresponds to the dominant language of the surrounding community. This article provides a structured review of key mechanisms involved in spelling acquisition, orthographic depth, and cross-linguistic influence in bilingual and immersion contexts. On this basis, it proposes a conceptual and predictive framework specifying how the orthographic depth of the instructional language modulates spelling strategies and spelling error profiles in L1. Focusing on French-speaking pupils enrolled in immersion programmes with L2s characterised by either predominantly phonemic or opaque orthographies, the framework integrates strategy-based models of orthographic development. The model distinguishes phonological, lexical, and morphographic components of orthographic knowledge and predicts that immersion in phonemic-dominant orthographies favours phonographic dominance and regularisation patterns, whereas immersion in opaque orthographies promotes greater reliance on lexical-orthographic strategies, resulting in distinct and systematic spelling error profiles in French.

Posted: 03 April 2026

https://doi.org/10.20944/preprints202604.0293.v1

Article

Social Sciences

Language and Linguistics

AI and Data Analytics in Sustainable Financial Reporting and ESG Disclosure: A Systematic Literature Review

Percy Antonio Vilchez Olivares

Brandelt Jesús Artorga de la Cruz

Abstract: The intensification of ESG disclosure requirements under the Corporate Sustainability Reporting Directive (CSRD) and the International Sustainability Standards Board (ISSB) has increased the demand for artificial intelligence (AI) and data analytics to support large-scale sustainability reporting and verification. However, the existing academic literature remains fragmented across disciplinary domains, including natural language processing, machine learning, auditing, and regulatory compliance. This study addresses this gap through a PRISMA 2020-compliant systematic literature review of 45 peer-reviewed articles published between 2020 and 2025 and indexed in the Scopus database. The analysis combines bibliometric techniques using VOSviewer with qualitative thematic content analysis. The results reveal a rapidly expanding research field with a compound annual growth rate of 91.9%. Four major thematic dimensions emerge: (i) NLP and text mining for ESG disclosure analysis; (ii) machine learning applications for ESG scoring and corporate performance; (iii) AI-enabled ESG assurance, auditing, and governance; and (iv) regulatory frameworks and the digital transformation of sustainability reporting. The findings indicate that AI technologies are progressively transforming ESG disclosure from a predominantly narrative and self-reported practice into a data-driven and verifiable transparency system. These developments have important implications for regulators, corporate practitioners, assurance providers, and investors seeking to enhance the reliability and comparability of sustainability disclosures.

Posted: 17 March 2026

https://doi.org/10.20944/preprints202603.1378.v1

Concept Paper

Social Sciences

Language and Linguistics

WuYi. A Three-Level Cascade Architecture for Learning Chinese Radicals Through Sequential Multimodal Encoding, Narrative Chaining, and Mythological Macro-Organization

Stanislav E. Lauk-Dubitskiy

Abstract: This paper presents WuYi (五仪 "Five Rites"), a methodology for learning Chinese characters based on a three-level cascade architecture that integrates sequential multimodal encoding, inter-item narrative chaining, and a culturally grounded macro-narrative organized according to Wu Xing (五行) philosophy and classical Chinese mythology. At the micro level, five cognitive modalities are activated in a prescribed sequence with explicit transition criteria: mental visualization (Fire), phonological construction (Metal), kinesthetic anchoring (Wood), theatrical episodic simulation (Earth), and graphomotor reconstruction (Water). This sequencing overcomes working memory limitations through temporal unfolding rather than parallel presentation. At the meso level, 2–5 radicals are linked through continuous causal narratives that simultaneously serve discriminative, compositional, and retrieval functions. At the macro level, the entire corpus of 214 Kangxi radicals is distributed across a two-cycle mythological structure, a Cosmogonic Cycle (74 radicals) and a Legendary Cycle (131 radicals), each traversing five Wu Xing phases aligned with the canonical mythology of Nüwa, Shennong, Huangdi, and Fuxi. The methodology introduces several novel mechanisms: synthetic narrative calligrams that encode tonal contours through typographic modulation; chimeric tone spirits that bind homophonic morphemes across all four tones into single mnemonic characters; deferred mnemonic anchors that create proactive facilitation through spreading activation; and narrative-aligned primary encoding with autonomous fallback mnemonics activated through self-diagnosis. The theoretical framework integrates dual coding theory, levels of processing, embodied and situated cognition, cognitive load theory, the SPT effect, hierarchical retrieval cues, and narrative transportation theory. A between-subjects experimental protocol (N=90, three groups) for controlled validation is provided. No prior work was found that combines sequential multimodal cascade encoding, inter-item narrative chaining, or mythological macro-organization of character curricula.

Posted: 17 March 2026

https://doi.org/10.20944/preprints202603.1303.v1

Article

Social Sciences

Language and Linguistics

Redefining Linguistics: The Law of the Trio as a Universal Framework in Dialogue with Major Theories

Tedros Kifle Tesfa

Abstract: This study advances the Law of the Trio as a universal law of linguistics, positing that reality, thought, and language are ontologically equivalent yet formally distinct modalities of existence. Unlike prior frameworks that isolate language as computation, code, or communicative tool, the Trio establishes a foundational architecture: the recursive coupling of entity and state/behavior, enriched by layered modifiers. Sentences are reframed as semantic DNA, encoding identity, transformation, and relational depth across modalities. To formalize this claim, the paper introduces EMi/VMi,j notation, where i indexes modifier type and j denotes recursion depth. Worked examples and cross‑linguistic analysis (English, Korean, Basque) confirm semantic invariance across typologically distinct languages. Direct mapping to event semantics and thematic roles highlights both alignment and innovation, with recursion depth providing a computable dimension absent from existing models. Comparative analysis shows how the Trio consolidates and extends generative grammar, cognitive science, pedagogy, and semiotics by resolving their limitations through recursive semantic geometry. Applications in pedagogy and natural language processing demonstrate practical relevance. By restructuring linguistics into semantic geometry, the Trio offers a testable, falsifiable, and universal law of language that unifies theory and practice.

Posted: 17 March 2026

https://doi.org/10.20944/preprints202508.0748.v3

Article

Social Sciences

Language and Linguistics

Behavioral vs. Verbal Methods in Translation Quality Evaluation: A Cognitive Experimental Study

Xin Huang

Xiang Zhang

Abstract: This study explores the sensitivity differences between behavioral experiments and verbal reports in translation quality evaluation. Results indicate that behavioral metrics (e.g., response times) are significantly more sensitive to syntactic-pragmatic manipulations (phrase order) than verbal reports. Translations with congruent phrase order received higher ratings and faster response times compared to those with incongruent order. However, most participants explicitly denied phrase order's influence in verbal reports. Lexical equivalence showed no significant impact on explicit ratings but increased cognitive effort, as indicated by slower response times for approximate lexical matches. These findings reveal a critical dissociation between implicit cognitive processes and explicit awareness in translation evaluation. The study highlights that translation assessment involves both implicit System 1 processes and explicit System 2 reasoning, offering new cognitive insights for translation research and practical implications for translator education and machine translation assessment. By bridging cognitive science and translation studies, this research contributes to a paradigm shift: translation quality is not merely what evaluators say it is, but what their cognitive behavior reveals it to be.

Posted: 16 March 2026

https://doi.org/10.20944/preprints202603.1087.v1

Article

Social Sciences

Language and Linguistics

Morphosyntactic Integration of Single-Word Anglicisms in Border Mexican Spanish

Ruben Roberto Peralta-Rivera

Rafael Saldívar-Arreola

Abstract: Loanword research on Anglicisms has largely centered on lexical borrowing and phonological adaptation, with comparatively limited attention to morphosyntactic integration in recipient grammars. This study examines the syntactic behavior of single-word Anglicisms in Mexican Spanish, drawing on phonetically classified corpora of 131 monosyllabic Anglicisms with monophthongs extracted from spontaneous speech by Spanish–English bilinguals in the Tijuana–San Diego border region. Building on prior acoustic analyses based on F1 and F2 vowel measurements, the study investigates the relationship between phonological adaptation and morphosyntactic integration. Results reveal a gradient pattern of incorporation. Anglicisms exhibiting Spanish-like phonetic properties tend to occupy canonical syntactic positions and show greater compatibility with Spanish functional morphology, whereas phonetically non-adapted forms more frequently resist morphological marking and display island-like behavior within otherwise Spanish clauses. The analysis examines distribution across nominal, adjectival, and prepositional domains, as well as object positions, enabling a fine-grained assessment of degrees of morphosyntactic integration. The former is illustrated as follows: (1) Guardo cash ([kaʃ]) por si acaso; (2) Si hacen match ([mæʧ]), puede funcionar. Adopting a usage-based and contact-oriented perspective for syntactic borrowing (Bybee, 2015), the study is situated within the Matrix Language Frame model (Myers-Scotton, 1993; Muysken, 2000) and recent approaches to insertional borrowing (Poplack & Dion, 2012; Onysko & Winter-Froemel, 2011). A central contribution lies in establishing a principled link between morphosyntactic behavior and an independently motivated phonetic classification, offering convergent evidence for the systematic integration of Anglicisms into Spanish grammar. At a broader analytical level, the study advances debates on syntactic borrowing and contact-induced change by demonstrating that Anglicisms are subject to Spanish morphosyntactic constraints rather than functioning as unconstrained lexical insertions, and by developing an interface-based account of borrowing that captures the gradient nature of grammatical incorporation in contact settings and contributes a corpus-based, empirically grounded perspective to typologies of borrowing in Spanish contact linguistics.

Posted: 13 March 2026

https://doi.org/10.20944/preprints202602.1797.v1

Article

Social Sciences

Language and Linguistics

Using Translog-II for Conducting Keylogging Experiments

Longhui Zou

Michael Carl

Abstract: Translation process research (TPR) relies on objective behavioral data to uncover the cognitive mechanisms underlying translation. This paper provides a comprehensive methodology for using Translog-II, a specialized tool for recording user activity data during translation tasks. We outline the complete experimental workflow, from project configuration to data collection, demonstrated through an English-to-Chinese translation-from-scratch case study. The study details the integration of Translog-II with the CRITT Translation Process Research Database (TPR-DB) to facilitate advanced post-processing. Key technical challenges are addressed, specifically the complexities of keystroke-to-word mapping for logographic scripts requiring Input Method Editors (IMEs). We further demonstrate automated alignment protocols, multidimensional error annotation, and data visualization techniques utilizing Python scripts and Shiny R interfaces. The results indicate that while automated mapping is generally robust, specific technical noise, particularly regarding long deletions, can be mitigated through systematic analysis. Ultimately, this protocol establishes a reproducible framework for exploring translator behavior, enhancing the precision of data-driven insights into cognitive translation processes.

Posted: 13 March 2026

https://doi.org/10.20944/preprints202603.1043.v1

Article

Social Sciences

Language and Linguistics

Artificial Intelligence and Academic Honesty: Challenges in the Digital Classroom

Taylor Smith Heathen

Abstract: The integration of artificial intelligence (AI) in digital classrooms has introduced both opportunities and challenges for academic honesty. This narrative review study explored how AI tools influence students’ learning behaviors, assessment practices, and ethical decision-making in academic tasks. Data were collected from students and educators through surveys, interviews, and document analysis, focusing on AI-assisted writing, digital platforms, and institutional policies. Findings reveal that while AI can enhance learning efficiency and engagement, it also blurs the boundary between legitimate academic support and misconduct. Many students perceive AI use as similar to peer assistance, resulting in uncertainty regarding ethical practices. Lower proficiency students were particularly prone to reliance on AI-generated outputs, highlighting the need for targeted instructional support. Traditional assessment formats, such as essays and take-home assignments, were identified as vulnerable to AI misuse, prompting calls for process-oriented evaluations, reflective tasks, and in-class assessments. The study also emphasizes the importance of clear institutional policies and AI literacy programs in promoting responsible use. Moreover, emerging technological risks, including deepfake content, underscore the necessity of proactive guidance and monitoring. Overall, the research suggests that fostering academic integrity in AI-mediated classrooms requires a balanced approach, combining ethical education, innovative pedagogy, and policy development. By cultivating transparency, critical thinking, and responsible AI engagement, institutions can maximize AI’s educational benefits while safeguarding authenticity and integrity in student work.

Posted: 28 February 2026

https://doi.org/10.20944/preprints202602.2041.v1

Short Note

Social Sciences

Language and Linguistics

Linguistic Misrepresentation in Pandemic Terminology: A Cognitive–Linguistic Critique of ‘Small Gatherings Cancellation’

Soheil Daneshzadeh

Abstract: This article identifies a terminological misrepresentation in the expression ‘small gatherings cancellation’—ranked by Haug et al. (2020) as the most effective non-pharmaceutical intervention during the COVID-19 pandemic. Corpus-based and theoretical analyses demonstrate that small gathering conventionally denotes a planned or spontaneous social event, whereas the predicate cancellation reinforces this event-based frame. Consequently, the phrase fails to capture the intended reference to restrictions on simultaneous presence in commercial or professional settings. Drawing on cognitive-linguistic theory and institutional usage from the WHO and CDC, this paper shows how such misrepresentation may trigger unintended conceptual frames, leading to interpretive ambiguity in both scholarly and policy contexts. Three alternatives are proposed to achieve better semantic alignment and enhance terminological precision and communicative clarity in future public-health discourse.

Posted: 05 February 2026

https://doi.org/10.20944/preprints202103.0698.v3

Article

Social Sciences

Language and Linguistics

Near-Merger and Contextual Sensitivity in the Perception of /n–l/ in Sichuan Mandarin

Minghao Zheng

Allen Shamsi

Ratree Wayland

Abstract: Background/Objectives: Sichuan Mandarin is often described as exhibiting overlap or merger between word-initial /n/ and /l/, but perceptual sensitivity across phonetic contexts remains underexplored. This study examines whether perception of the /n–l/ contrast varies by vowel context and listener experience. Methods: Thirty-two Sichuan Mandarin listeners completed categorical identification and same–different AX discrimination tasks using seven-step /n/→/l/ continua derived from native-speaker productions in /i/ and /a/ contexts. Sensitivity, response bias, accuracy, and response times were analyzed alongside individual differences. Acoustic properties of the stimuli were quantified using spectral and amplitude-based measures. Results: Listeners showed overall reduced sensitivity to the /n–l/ contrast, with substantially stronger perceptual differentiation in /i/ than /a/ context. Bias patterns were comparable across contexts, indicating sensitivity-driven effects. Acoustic analyses showed more robust cue structure in the /i/ continuum. Age, education, and Standard Mandarin experience modulated response efficiency but did not eliminate the vowel asymmetry. Conclusions: Results support a context-dependent near-merger of /n/ and /l/, shaped by acoustic cue availability and experience-based cue exploitation.

Posted: 23 January 2026

https://doi.org/10.20944/preprints202601.1767.v1

Article

Social Sciences

Language and Linguistics

Comparing Different Physics Fields Using Statistical Linguistics

María Fernanda Sánchez-Puig

Carlos Gershenson

Carlos Pineda

Abstract: The large digital archives of the American Physical Society (APS) offer an opportunity to quantitatively analyze the structure and evolution of scientific communication. In this paper, we perform a comparative analysis of the language used in eight APS journals (Phys. Rev. A, B, C, D, E, Lett., X, Rev. Mod. Phys.) using methods from statistical linguistics. We study word rank distributions (from monograms to hexagrams), finding that they are consistent with Zipf’s law. We also analyze rank diversity over time, which follows a characteristic sigmoid shape. To quantify the linguistic similarity between journals, we use the rank-biased overlap (RBO) distance, comparing the journals not only to each other, but also to corpora from Google Books and Twitter. This analysis reveals that the most significant differences emerge when focusing on content words rather than the full vocabulary. By identifying the unique and common content words for each specialized journal, we develop an article classifier that predicts a paper’s journal of origin based on its unique word distribution. This classifier uses a proposed “importance factor” to weigh the significance of each word. Finally, we analyze the frequency of mention of prominent physicists and compare it to their cultural recognitions ranked in the Pantheon dataset, finding a low correlation that highlights the context-dependent nature of scientific fame. These results demonstrate that scientific language itself can serve as a quantitative window into the organization and evolution of science.

Posted: 13 January 2026

https://doi.org/10.20944/preprints202601.0989.v1

Article

Social Sciences

Language and Linguistics

Perceptual (Static) Active Inference Approach to the Superior Production Effect of Speaking over Writing: An Experiment and Computational Model Report

Roberto Limongi

Oluwagbemisola Oguntoye

Angelica Silva

Abstract: This paper reports a cognitive psychology experiment and a Markov decision process (MDP) model of the production effect: higher memory retrieval that follows speaking aloud or writing/typing words, as opposed to lower memory retrieval when words are read silently. Current models of the production effect draw on the global-matching framework of memory. We identify four limitations of these models and present an MDP model (a perceptual active inference model) to causally explain a superior production effect of speaking over writing. University students performed a word-production task comprising speaking and writing conditions, followed by a memory test. The results showed main effects of condition on accuracy and response times. The MDP model indicated higher sensory precision during memory retrieval in the speaking condition than in the writing condition. Through Bayesian model selection, we evaluated whether the MDP model, as a mechanistic active-inference model, provided higher construct validity than a descriptive linear model (fit via Variational Laplace). The MDP model outperformed the linear model, suggesting that production modalities are hidden states that cause the visual sensory observation of words that had been linguistically produced. Crucially, the MDP model explains both group effects and individual variability, confirming the reliability paradox of statistical models.

Posted: 16 December 2025

https://doi.org/10.20944/preprints202512.1253.v1
