Social Sciences


Article
Social Sciences
Language and Linguistics

María Fernanda Sánchez-Puig, Carlos Gershenson, Carlos Pineda

Abstract: The large digital archives of the American Physical Society (APS) offer an opportunity to quantitatively analyze the structure and evolution of scientific communication. In this paper, we perform a comparative analysis of the language used in eight APS journals (Phys. Rev. A, B, C, D, E, Lett., X, Rev. Mod. Phys.) using methods from statistical linguistics. We study word rank distributions (from monograms to hexagrams), finding that they are consistent with Zipf’s law. We also analyze rank diversity over time, which follows a characteristic sigmoid shape. To quantify the linguistic similarity between journals, we use the rank-biased overlap (RBO) distance, comparing the journals not only to each other, but also to corpora from Google Books and Twitter. This analysis reveals that the most significant differences emerge when focusing on content words rather than the full vocabulary. By identifying the unique and common content words for each specialized journal, we develop an article classifier that predicts a paper’s journal of origin based on its unique word distribution. This classifier uses a proposed “importance factor” to weigh the significance of each word. Finally, we analyze how frequently prominent physicists are mentioned and compare this frequency to their cultural recognition as ranked in the Pantheon dataset, finding a low correlation that highlights the context-dependent nature of scientific fame. These results demonstrate that scientific language itself can serve as a quantitative window into the organization and evolution of science.
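As an illustration of the rank-biased overlap mentioned in the abstract above, the following is a minimal sketch of the truncated form of RBO for two ranked word lists. The function names, the persistence parameter `p = 0.9`, and the choice to define the distance as one minus the similarity are assumptions made here for illustration; the abstract does not state which variant or parameters the authors use.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9):
    """Truncated rank-biased overlap of two ranked lists without duplicates.

    At each depth d the agreement is the fraction of items shared by the
    two top-d prefixes; agreements are combined with geometric weights
    p**(d - 1), so the top of the ranking counts most.
    """
    depth = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    weighted_sum = 0.0
    for d in range(1, depth + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        agreement = len(seen_a & seen_b) / d
        weighted_sum += (p ** (d - 1)) * agreement
    return (1 - p) * weighted_sum


def rbo_distance(ranking_a, ranking_b, p=0.9):
    """Assumed distance form: one minus the truncated RBO similarity."""
    return 1.0 - rbo_truncated(ranking_a, ranking_b, p)


# Hypothetical top-ranked words from two corpora.
journal_words = ["the", "of", "quantum", "state", "energy"]
books_words = ["the", "of", "and", "to", "in"]
print(rbo_distance(journal_words, books_words))
```

Because the sum is truncated at the list length n, two identical lists score 1 - p^n rather than exactly 1; extrapolated variants of RBO correct for this.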

Article
Social Sciences
Language and Linguistics

Roberto Limongi, Oluwagbemisola Oguntoye, Angelica Silva

Abstract: This paper reports a cognitive psychology experiment and a Markov decision process (MDP) model of the production effect—higher memory retrieval that follows speaking aloud or writing/typing words, as opposed to lower memory retrieval when words are read silently. Current models of the production effect draw on the global-matching framework of memory. We identify four limitations of these models and present an MDP model (a perceptual active inference model) to causally explain a superior production effect of speaking over writing. University students performed a word-production task comprising speaking and writing conditions, followed by a memory test. The results showed main effects of condition on accuracy and response times. The MDP model indicated higher sensory precision during memory retrieval in the speaking condition than in the writing condition. Through Bayesian model selection, we evaluated whether the MDP model, as a mechanistic active-inference model, provided higher construct validity than a descriptive linear model (fit via Variational Laplace). The MDP model outperformed the linear model, suggesting that production modalities are hidden states that cause the visual sensory observation of words that had been linguistically produced. Crucially, the MDP model explains both group effects and individual variability, confirming the reliability paradox of statistical models.

Short Note
Social Sciences
Language and Linguistics

Soheil Daneshzadeh

Abstract: This article identifies a terminological misrepresentation in the expression “small gatherings cancellation”—ranked by Haug et al. as the most effective non-pharmaceutical intervention during the COVID-19 pandemic. Corpus-based and theoretical analyses demonstrate that “small gathering” conventionally denotes a planned or spontaneous social event, while the predicate “cancellation” reinforces this event-based frame. Consequently, the phrase fails to capture the intended reference to restrictions on simultaneous presence in commercial or professional settings. Drawing on cognitive-linguistic theory and institutional usage from the WHO and CDC, this paper shows how such misrepresentation may trigger unintended conceptual frames, leading to interpretive ambiguity in both scholarly and policy contexts. Three alternatives are proposed to achieve better semantic alignment and enhance terminological precision and communicative clarity in future public-health discourse.

Concept Paper
Social Sciences
Language and Linguistics

Luis Escobar L.-Dellamary

Abstract: This paper proposes Trace & Trajectory (T&T) Semantics, a pre-representational framework for understanding meaning as intent-driven navigation through informational space. Motivated by fieldwork with multimodal, intersubjective communication—where meaning emerges through gesture, prosody, and embodied coordination rather than propositional structures—I extend Hoffman and Prakash's trace logic to continuous semantic trajectories. The framework models meaning not through Euclidean feature spaces but through attractor dynamics: meaning stabilizes where intent-driven trajectories converge under dissipative constraints, creating basins that guide navigation without representational anchoring. The critical innovation is operator σ's fractal architecture. As meta-awareness intensifies, trace patterns achieve self-similarity across scales, enabling collapse and reconjunction without infinite regress. This mechanism naturalizes prototype effects, conceptual metaphor, image schema stability, and abstract reasoning as emergent from how conscious agents navigate meaning-space under intent, dissipation, and σ-modulation—not from mental representations. T&T dissolves the hard problem of semantic content by grounding meaning in informational dynamics during concrete intersubjective engagement, where patterns maintain semiotic coherence through intent-driven navigation, without reference to external representational targets. This preserves systematicity while respecting embodied intuition. The framework offers cognitive linguists, anthropologists, and semantic theorists an approach that is formally rigorous (utilizing attractor dynamics, Markov kernels, and σ-operators), empirically tractable (applicable to actual discourse and interaction), and phenomenologically adequate. Crucially, the formalism describes patterns in conscious, intentional dynamics—not neural mechanisms—making it appropriate for phenomena in which agent purpose drives semantic organization.

Article
Social Sciences
Language and Linguistics

Luis Escobar L.-Dellamary

Abstract: Traditional linguistic analysis segments gestures and signs into discrete morphemes—handshape, location, movement—treating these as combinable building blocks. This segmentation, however, reflects analytical resolution rather than ontological structure. At coarse-grained analysis, continuous trajectorial dynamics appear segmented because fine distinctions fall below the analytical thresholds of the model's toolkit. This paper argues that gestures and signs function as complete phrasal units whose meaning emerges through navigation rather than morphemic assembly. The supposed “atoms” of manual-visual communication are observational artifacts generated by insufficient analytical resolution. We posit that the fundamental unit of gesture and sign is not the morpheme but the trajectory—a continuous navigational arc through informational space guided by conventional traces. Our high-definition approach refers not to perceptual refinement but to analytical granularity: developing theoretical tools capable of tracking trajectorial dynamics at finer informational scales without imposing artificial segmentation. Conventional traces saturate through repeated navigation into specialized attractors (gestural configurations, modal semantics, conceptual prototypes) that emerge as differentiated regions within pre-representational informational space, analogous to stem cell specialization into distinct tissues. This trajectorial approach resolves persistent paradoxes in classifier-predicate analysis, verb segmentation, and gradient iconicity by recognizing them as resolution-dependent phenomena. The analysis extends beyond manual-visual modalities: spoken utterances reveal trajectorial character when examined at sufficient spatiotemporal resolution, challenging written-language models that artificially atomize continuous phonetic-semantic flow. Gestures and signs, precisely because they resist orthographic reification, provide exceptionally clear windows into the trajectorial dynamics underlying all human meaning-making.

Article
Social Sciences
Language and Linguistics

Kazi Abdul Mannan, Khandaker Mursheda Farhana

Abstract: This study examines the presence and significance of root words derived from non-Arabic languages in the Holy Quran, with a focus on their Hebrew, Aramaic, Syriac, Greek, Persian, and Ethiopic origins. Although the Quran is traditionally regarded as a purely Arabic revelation, linguistic and historical evidence reveals the integration of foreign lexical elements into its discourse. Through a comparative linguistic analysis, this research examines how these borrowed roots were phonologically adapted, morphologically assimilated, and semantically recontextualised to align with Quranic themes and theological narratives. The findings indicate that such lexical incorporations were not incidental but rather reflective of the multilingual and multicultural context of 7th-century Arabia. Furthermore, the study emphasises the Quran's dynamic linguistic environment, which enabled it to engage diverse audiences while maintaining its claim of ʿArabī mubīn (clear Arabic). By examining selected root words and their original meanings, this paper underscores the Quran’s role as a unifying spiritual text and a linguistic artefact shaped by historical intertextuality. This analysis contributes to broader discussions in Quranic linguistics, comparative Semitic philology, and Islamic theological thought.

Review
Social Sciences
Language and Linguistics

Bing Cheng, Yu Zou, Xiaojuan Zhang, Yang Zhang

Abstract: This study presents a comprehensive bibliometric review of robot-assisted language learning (RALL) from 2003 to 2025, analyzing 439 publications from Web of Science, Scopus, PubMed, and Dimensions. Using Biblioshiny and VOSviewer, we mapped publication patterns, citations, keyword networks, and thematic evolution. Findings revealed steady growth peaking at 71 publications in 2023 before a slight decline in 2024, with China, the Netherlands and the United States emerging as the leading contributors and most cited nations. Keyword clustering identified four themes: educational robot, artificial intelligence, human-robot interaction and children. Thematic evolution analysis revealed a shift from foundational research to a multidisciplinary domain integrating AI, VR, IoT, and LLMs, emphasizing learner-centered designs. However, research remains fragmented and technology-driven rather than grounded in pedagogical frameworks. This review calls for bridging the gap between innovation and theory-grounded robot design. Only through interdisciplinary collaboration and evidence-based practice can RALL fulfill its transformative potential in language education.

Article
Social Sciences
Language and Linguistics

Xiaojuan Zhang, Bing Cheng, Xi Xiang, Yang Zhang

Abstract: Listeners vary in their perception of speech, falling along a continuum from categorical to continuous. We applied a Bayesian computational framework to model this individual difference in speech perception. We analyzed publicly available data (Honda et al., 2024) from 195 participants across four phonetic conditions using both two-alternative forced choice and visual analogue scale tasks. Our model characterizes each listener’s perception using two key parameters: perceptual warping (τ), the signal-to-noise ratio of phonetic encoding, and noise variance (\( \sigma_S^2 \)), a proxy for perceptual noise in experimental designs. Combining these two parameters revealed four perceptual profiles: Veridical (high τ, low \( \sigma_S^2 \)), Categorical (low τ, low \( \sigma_S^2 \)), Compensatory (low τ, high \( \sigma_S^2 \)), and Noisy (high τ, high \( \sigma_S^2 \)). These profiles predicted behavioral patterns coherently, while successfully distinguishing between listeners who would appear similar when characterized by behavioral measures alone. Critically, results revealed that profile distributions shifted dramatically based on phonetic conditions, with primary cues yielding a balanced mix of profiles and secondary cues producing distributions skewed heavily toward Veridical and Compensatory listeners (80%). Underscoring this flexibility, intraclass correlations for both τ and \( \sigma_S^2 \) were zero, with phonetic condition effects 30 times stronger for \( \sigma_S^2 \) (χ² = 803.91) than τ (χ² = 29.47). These findings challenge the traditional view of categorical perception as a fixed characteristic, demonstrating instead that it is a flexible, context-driven perceptual state.

Article
Social Sciences
Language and Linguistics

Bing Cheng, Xiangrong Dai, Xi Xiang, Xiaojuan Zhang, Yang Zhang

Abstract: This study introduces the root mean square error (RMSE) as a new metric for quantifying gradient speech perception in visual analog scale (VAS) tasks. By measuring the deviation of individual responses from an ideal linear mapping between stimulus and percept, RMSE offers a theoretically transparent alternative to traditional metrics like slope, response consistency, and the quadratic coefficient. To validate these metrics, we first used simulated data representing five distinct perceptual response profiles: ideal gradient, categorical, random, midpoint-biased, and conservative. The results revealed that only RMSE correctly tracked the degree of true gradiency, increasing monotonically from the ideal gradient profile (RMSE = 5.48) to random responding (RMSE = 42.16). In contrast, traditional metrics failed critically; for example, slope misclassified non-gradient, midpoint-biased responding as highly gradient (slope = 0.24). When applied to published empirical VAS data, RMSE demonstrated strong convergent validity, correlating robustly with response consistency (r ranging from -0.44 to -0.89) while avoiding the ambiguities of other measures. Crucially, RMSE exhibited moderate-to-high cross-continuum stability (mean r = 0.51), indicating it captures a stable, trait-like perceptual style. By providing a more robust and interpretable measure, RMSE offers a clearer lens for investigating the continuous nature of phonetic categorization and individual differences in speech perception.
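The RMSE metric described in the abstract above can be sketched as follows. This is a minimal illustration, assuming a VAS response scale from 0 to 100 and an ideal mapping that runs linearly from the first to the last continuum step; the function name `vas_rmse`, the `scale_max` parameter, and the sample data are hypothetical, and the authors' exact procedure may differ.

```python
import math


def vas_rmse(steps, responses, scale_max=100.0):
    """RMSE between VAS responses and an ideal linear stimulus-to-percept mapping.

    The ideal response for a step is assumed to rise linearly from 0 at the
    lowest continuum step to scale_max at the highest one.
    """
    if len(steps) != len(responses) or not steps:
        raise ValueError("steps and responses must be non-empty and equal in length")
    lowest, highest = min(steps), max(steps)
    if highest == lowest:
        raise ValueError("the continuum needs at least two distinct steps")
    squared_errors = [
        (response - scale_max * (step - lowest) / (highest - lowest)) ** 2
        for step, response in zip(steps, responses)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical listeners on a five-step continuum.
continuum = [1, 2, 3, 4, 5]
gradient_listener = [0, 25, 50, 75, 100]      # follows the ideal line
categorical_listener = [0, 0, 50, 100, 100]   # step-like responses
print(vas_rmse(continuum, gradient_listener))
print(vas_rmse(continuum, categorical_listener))
```

Under these assumptions a listener who tracks the ideal line scores zero, and the score grows as responses depart from it.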

Article
Social Sciences
Language and Linguistics

Xiaojuan Zhang, Bing Cheng, Yang Zhang

Abstract: The mechanisms linking speech production and perception remain underspecified, particularly in how segmental and suprasegmental features are processed across different contextual variations. This study investigated whether perceptual cue weighting could be predicted by distributional reliability of acoustic cues in production, focusing on the Mandarin Tone 2-Tone 3 contrast across both gradient coarticulatory (T1, T2, T4) and categorical tone sandhi (T3) contexts. We quantified production distributional reliability using the Bhattacharyya coefficient and assessed perceptual cue weighting through relative weight analysis. Bayesian mixed-effects modeling showed strong evidence for context-dependent acoustic distributions in production (BF₁₀ = 9.87 × 10²⁸) and perception (BF₁₀ = 4.56 × 10¹⁵³). Critically, production-perception coupling emerged selectively. In gradient contexts, higher production reliability strongly predicted perceptual weighting (BF₁₀ = 12.48), with robust negative correlations for critical cues in T2 (Cohen’s d = -2.51, 95% CI [-2.93, -2.09]) and T4 contexts (d = -1.76, 95% CI [-2.28, -1.26]), but not in T1 context (d = -0.30, 95% CI [-1.02, 0.43]). No such coupling was observed for secondary cues across contexts (|d| < 0.8). In contrast, in the categorical T3 sandhi context, production statistics did not predict perceptual weights. These findings reveal a context-sensitive production-perception relationship: tightly coupled in gradient coarticulatory contexts, but dissociated in categorical rule-governed environments. This pattern supports a dual-route model for tone processing involving a statistical-auditory stream for phonetic variations and a symbolic-phonological stream for abstract alternations.
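The Bhattacharyya coefficient named in the abstract above measures the overlap between two distributions. The sketch below computes it for two histograms over the same bins; the binning of an acoustic cue, the function name, and the sample counts are assumptions for illustration, and the abstract does not say how the authors convert the coefficient into a reliability score.

```python
import math


def bhattacharyya_coefficient(counts_p, counts_q):
    """Overlap between two histograms defined over the same bins.

    Counts are normalised to probabilities; the result is 1.0 for
    identical distributions and 0.0 for distributions with no shared bins.
    """
    if len(counts_p) != len(counts_q):
        raise ValueError("histograms must use the same bins")
    total_p, total_q = sum(counts_p), sum(counts_q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each histogram needs a positive total count")
    return sum(
        math.sqrt((p / total_p) * (q / total_q))
        for p, q in zip(counts_p, counts_q)
    )


# Hypothetical binned values of one acoustic cue for two tone categories.
tone2_counts = [1, 6, 12, 5, 1, 0]
tone3_counts = [0, 1, 4, 11, 7, 2]
print(bhattacharyya_coefficient(tone2_counts, tone3_counts))
```

A low coefficient means the two categories are well separated along that cue in production.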

Article
Social Sciences
Language and Linguistics

Bing Cheng, Xi Xiang, Xiangrong Dai, Yu Zou, Xiaojuan Zhang, Yang Zhang

Abstract: Purpose: While the Visual Analog Scale (VAS) has revealed gradient perception in segmental speech sounds, its application to lexical tones, a critical yet understudied suprasegmental feature, has been absent. This study investigated lexical tone categorization using VAS, directly comparing it with traditional two-alternative forced-choice (2AFC). Method: Eighty-four native speakers categorized an 11-step F0 continuum from Mandarin Tone 1 to Tone 2 in both tasks. Four-parameter logistic functions yielded slope (categorization sharpness) and response variability. Within-category sensitivity (Δ) was quantified from VAS responses. Results: Paired Wilcoxon signed-rank tests showed significantly shallower slopes (p < .001, r = .76) and lower variability (p < .001, r = .87) in VAS versus 2AFC. One-sample t-tests confirmed listeners discriminated fine-grained differences within categories, with Δ reliably exceeding zero (left: M = 0.0335, t(270) = 8.89, p < .001; right: M = 0.0256, t(316) = 8.38, p < .001). Crucially, slope and response variability were weakly correlated in VAS (ρ = .27, p < .05) but strongly negatively correlated in 2AFC (ρ = -.67, p < .001). Moreover, response variability correlated significantly across tasks (ρ = .40, p < .001), while slopes did not. Conclusion: These findings provide direct evidence for gradient perception at the suprasegmental level, further establishing VAS as a sensitive tool for uncovering the nature of speech categorization. The dissociation between task-dependent gradiency and stable response variability helps reconcile apparent conflicts in the categorical perception literature, suggesting that these conflicts may stem from methodological constraints rather than genuine theoretical disagreements.
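For readers unfamiliar with the four-parameter logistic function mentioned in the Method above, here is a minimal sketch of one common parameterisation. The parameter names (`lower`, `upper`, `midpoint`, `slope`) and the example values are assumptions for illustration; the abstract does not specify the authors' parameterisation or fitting routine.

```python
import math


def four_param_logistic(x, lower, upper, midpoint, slope):
    """Four-parameter logistic curve.

    lower and upper are the asymptotes, midpoint is the continuum step at
    which the curve is halfway between them, and slope controls how sharply
    responses change around the midpoint.
    """
    return lower + (upper - lower) / (1.0 + math.exp(-slope * (x - midpoint)))


# Hypothetical curves over an 11-step continuum: a sharp, categorical-like
# listener and a shallow, gradient-like listener.
for step in range(1, 12):
    sharp = four_param_logistic(step, 0.0, 1.0, 6.0, 2.5)
    shallow = four_param_logistic(step, 0.0, 1.0, 6.0, 0.5)
    print(step, round(sharp, 3), round(shallow, 3))
```

In this form a larger slope gives a steeper transition between the two response categories, which is why slope is read as categorization sharpness.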

Short Note
Social Sciences
Language and Linguistics

Jaime A. Teixeira da Silva

Abstract: The World Health Organization (WHO) is an authoritative body that focuses on health-related issues globally. Given its prominence, and as a gesture of respect, it is important to accurately represent the organization’s name. A search for “World Wellbeing Association” on 30 November 2024 with the Problematic Paper Screener, an AI-driven tool that identifies linguistic distortions, or ‘tortured phrases’, returned 59 documents. After exclusions, 19 documents were examined, almost all of which were open access, most mentioning this ‘tortured phrase’ once, to define WHO. One of those documents has been retracted. WHO is advised to reach out to the authors and journals to request a correction of their misrepresented organization in the scientific literature.

Article
Social Sciences
Language and Linguistics

Hongbing Huang, Yaru Meng, Lingjie Tang, Yu Cui, Liang Xu

Abstract: The present study employed latent profile analysis (LPA) to explore 586 Chinese college students’ profiles of self-efficacy in an English writing MOOC learning context. Based on the LPA results, differences in linguistic self-efficacy, self-regulatory efficacy, and performance self-efficacy were compared. LPA generated three profiles of students’ self-efficacy: low on all self-efficacy, average self-efficacy, and moderately high self-efficacy. The results of ANOVAs revealed that the three profiles showed significant differences in interaction and discussion, perseverance in online learning, attitude value, preference value, and flexibility in online learning. These findings provide constructive insights into the subclasses of students’ self-efficacy and have significant implications for self-efficacy research and its implementation in educational interventions in college English writing MOOC-based second language acquisition to improve learners’ English writing proficiency.

Article
Social Sciences
Language and Linguistics

Brian Herreño Jiménez, Sánchez Sánchez Raúl, Alcaraz Carrión Daniel, López Bernal Ariadna, Pagán Cánovas Cristóbal

Abstract: We present preliminary results of a new methodology to study co-speech gesture in relation to specific linguistic structures. We draw on a large-scale video repository with time-aligned transcripts to build corpora in which the same linguistic expression is uttered by different speakers across multiple clips. We then extract dynamic coordinates of key body points to model their variation in relation to what is being said. In this paper, we present analyses of the distribution of wrist motion in gesture space for a dataset of 379 videos with utterances of 44 deictic time expressions in English (words or phrases pointing at the past, present, or future in relation to a center of temporal reference, e.g. “yesterday/today/tomorrow”). Even overall distributions of wrist positions in peripheral areas of gestural motion turn out to be influenced by these semantic distinctions. More fine-grained models are to be expected from the reconstruction of gestural trajectories, based on the chronological sequence of positions detected in each video. These initial results already suggest that so-called non-verbal behavior is deeply structured and attuned to language, quite beyond our current understanding. Once scaled up, such models have the potential to dramatically change any technologies connected to human communicative behavior.

Article
Social Sciences
Language and Linguistics

Alejandro Curado Fuentes

Abstract: In academic L2 English / EFL (English as a Foreign Language) writing, GenAI (Generative Artificial Intelligence) and other digital tools are being extensively explored. However, this AI exploration for academic / research writing has been addressed less at postgraduate levels, and even less so across different scientific fields. This study examines this topic within Social Sciences at the University of Extremadura, Spain. Seven participants with a B2 English level or higher enrolled in a 10-hour hybrid course about GenAI for academic English writing in October and November of 2024, focusing on AI tools and Broad Data-Driven Learning (BDDL) resources (e.g., simple online corpora tools) to assist their writing. Participants’ feedback was collected by qualitative means (in-class discussions, task writing annotation, and a final survey). Overall findings indicate notably positive responses to and usage of these tools for the improvement of their texts (e.g., linguistic analysis, lexical-grammatical refinement, and text style improvement). Participants also revealed miscellaneous approaches and strategies in their management of GenAI. Despite the study’s small sample, these preliminary findings suggest that postgraduate researchers in Social Sciences combine expert and linguistic knowledge effectively, demonstrating linguistic awareness and digital literacy concerns.

Article
Social Sciences
Language and Linguistics

Imed Reese Sy

Abstract: This review draws insights into the technical, historical, and socio-economic dimensions of AI’s rapid transformation. It traces AI’s progression from symbolic rule-based systems to data-driven statistical learning and deep neural networks, showing how advances in computational power, optimization methods, and large-scale data curation have enabled breakthroughs in perception, language, and decision-making. A historical lens underscores that contemporary innovations build on decades of research while leaving core challenges—such as interpretability, robustness, and sample efficiency—unresolved. Empirical analyses of reinforcement learning, transformer-based language models, and hybrid architectures reveal performance gains alongside persistent vulnerabilities, including adversarial susceptibility and contextual misinterpretation. Socio-economic assessments highlight AI’s dual role in boosting productivity and reshaping labor markets, with automation complementing high-skill tasks but displacing routine work. Bias detection experiments confirm that training data inequities can propagate into system outputs, reinforcing calls for fairness-centered design and governance. The study finds that AI adoption is uneven across regions and sectors, risking a widening digital divide. It emphasizes the necessity of robust, adaptive ethical and legal frameworks, cross-sector collaboration, and integration of AI literacy into education systems. Recommendations include advancing explainable AI to address “black box” concerns, fostering public-private partnerships for responsible innovation, and establishing international ethical guidelines informed by diverse cultural perspectives. Overall, the research concludes that AI’s trajectory must be guided by proactive governance, interdisciplinary engagement, and equitable access strategies to ensure its evolution enhances human well-being, supports sustainable development, and aligns with societal values.

Article
Social Sciences
Language and Linguistics

Edgar R. Eslit

Abstract: English language teaching today is saturated with methods that promise fluency, precision, or communicative ease—yet beneath this crowded landscape lies a deeper crisis: learners are trained to perform, not to reckon; to comply, not to narrate. Narrative Language Ecology (NLE) responds by reimagining language learning as a lived, ethical, and ecological act, aligned with UNESCO’s Sustainable Development Goals, particularly those advancing inclusive, equitable, and quality education. In NLE, each macroskill—listening, speaking, reading, writing, viewing, and representing—is treated not as a technical outcome but as a diagnostic entry point into context, agency, and reform. Learners engage with soundscapes, silences, and stories that demand presence and provoke reflection. Technology and AI are not shortcuts—they are narrative-critical instruments that flag bias, scaffold ethical clarity, and amplify rhythm without erasing voice. Values are structurally embedded, not decoratively appended, ensuring that every output breathes with empathy, accountability, and social relevance. NLE does not teach language as a system; it teaches language as a purposeful way of life—one shaped by reflection, responsiveness, and meaningful engagement with the world.

Article
Social Sciences
Language and Linguistics

Tedros Kifle Tesfa

Abstract: This manuscript introduces the Law of the Trio—a groundbreaking framework that models language, thought, and reality as structurally equivalent modalities of existence. Departing from traditional views that treat meaning as a referential or symbolic mapping, this work reframes meaning as a recursive ontological function: a dynamic enactment of being expressed through linguistic form. Each sentence is conceptualized as a “semantic particle,” encoding the triadic coupling of entity, state or behavior, and recursive modifiers across symbolic, cognitive, and perceptual domains. Meaning is not merely assigned—it is structured, layered, and invoked. Through the EMji/VMji notation system, the manuscript formalizes modifier hierarchy and semantic depth, enabling precise modeling of how meaning unfolds recursively within and across sentences. This approach bridges cognitive linguistics, semiotics, and structural ontology, offering a universal semantic function: Modality = f(Entity, State or Behavior). This function applies across languages, cultures, and cognitive systems, enabling semantic invariance and cross-modal alignment. The EMji/VMji system provides a formal scaffold for syntactic complexity and cognitive clarity, reframing sentence structure as recursive semantic geometry. Applications span first and second language acquisition, semantic parsing in artificial intelligence, intercultural communication, and philosophical modeling of meaning. By treating language as ontological geometry rather than syntactic engineering, the Law of the Trio invites a reimagining of linguistics as a science of structured resonance—where every utterance becomes a symbolic act of being.

Article
Social Sciences
Language and Linguistics

Peter T. Richtsmeier, Michelle W. Moore

Abstract: This study investigated whether statistical learning from one linguistic input influences learning of a subsequent input in preschool-aged children. The first input comprised word-medial consonant sequences in an "alien" word-learning experiment (e.g., /st/ or /nt/ in alien names). The second input comprised different consonant sequences (e.g., /sk/ and /ns/) in a follow-up "make-believe animal" statistical learning experiment. Thirty-four children were pseudorandomly assigned to one of two word sets in the initial alien experiment and subsequently completed a statistical learning task that manipulated the experimental frequency of the consonant sequences in the make-believe animal names. We hypothesized that prior exposure to /st/ or /nt/ in the alien names would lead to predictable phonological errors—fronting or stopping of /sk/ and /ns/, respectively—in the make-believe animal task. Mixed-effects and ordinal regression analyses revealed no significant effects of alien word set on error patterns or production accuracy for make-believe animal names, though experimental frequency influenced accuracy differently across consonant sequences. These findings suggest that while statistical learning is sensitive to input frequency, cross-experiment interactions may require greater similarity between inputs to manifest. The study contributes to the growing literature on multistream statistical learning and highlights the complexity of generalization in child speech development.

Article
Social Sciences
Language and Linguistics

Katharina Ehret

Abstract: This paper assesses the role of socio-demographic triggers on Kolmogorov-based complexity in spoken English varieties. It thus contributes to the ongoing debate on contact and complexity in the sociolinguistic typological research community. Currently, evidence on whether socio-demographic triggers influence the morphosyntactic complexity of languages is controversial and inconclusive, in particular for the proportion of non-native speakers and the number of native speakers, which are common proxies for language contact. In order to illuminate the issue from an English-varieties perspective, I use regression analysis to test several socio-demographic triggers in a corpus database of spoken English varieties. Language complexity here is operationalised in terms of Kolmogorov-based morphological and syntactic complexity. The results only partially support the idea that socio-demographic triggers influence morphosyntactic complexity in English varieties, i.e. the effects of speaker-related triggers turn out to be negative but non-significant. Yet, net migration rate shows a positive significant effect on morphological complexity, which needs to be seen in the context of English as a commodity and unequal access to English. I thus argue that socioeconomic triggers are better predictors for complexity than demographic speaker numbers. In sum, the paper opens up new horizons for research on language complexity.
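Kolmogorov complexity cannot be computed exactly, so compression-based approximations are commonly used in its place. The sketch below illustrates only that general idea, using the standard-library `zlib` compressor and a simple compressed-to-original size ratio; the function name and sample texts are hypothetical, and the abstract does not describe how the author separates morphological from syntactic complexity, so this is not a reproduction of that method.

```python
import zlib


def compression_ratio(text):
    """Compressed size divided by original size, in bytes.

    Texts with more repetition and regularity compress better and score
    lower; less predictable texts score higher.
    """
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must not be empty")
    return len(zlib.compress(raw, 9)) / len(raw)


# Hypothetical samples: a highly repetitive text and a more varied one.
repetitive = "the cat sat. " * 20
varied = ("Gestures, tones and rare words all resist a simple summary "
          "when every phrase differs from the last one.")
print(compression_ratio(repetitive))
print(compression_ratio(varied))
```

Ratios are only comparable across texts of similar length, because the compressor's fixed overhead weighs more heavily on short inputs.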


Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.


© 2026 MDPI (Basel, Switzerland) unless otherwise stated