Preprint Article

This version is not peer-reviewed.

Natural Language Processing in the Era of Large Language Models: Foundations, Integration, and Low-Resource Frontiers

Submitted: 06 March 2026
Posted: 06 March 2026


Abstract
Large Language Models (LLMs) have fundamentally transformed the landscape of Natural Language Processing (NLP), subsuming and redefining tasks that were once addressed by specialized, modular pipelines. This paper surveys the role of classical and contemporary NLP within modern LLM architectures, examining how foundational techniques — tokenization, syntactic parsing, semantic representation, and discourse modeling — have been absorbed into, and continue to inform, the pre-training and fine-tuning paradigms of transformer-based models. We further investigate the critical challenge of linguistic inclusivity, focusing on low-resource and morphologically complex languages that remain underserved by dominant English-centric corpora. Drawing on recent advances in cross-lingual transfer learning, multilingual pre-training, and data augmentation, we assess the progress and persistent gaps in extending LLM capabilities to such languages. Case studies on Southeast Asian, African, and indigenous language NLP toolkits illustrate practical strategies and remaining bottlenecks. We conclude by outlining open research directions at the intersection of structural NLP and generative AI.

1. Introduction

The relationship between Natural Language Processing and Large Language Models is one of both inheritance and transformation. NLP, as a field, developed over decades a rich repertoire of techniques: part-of-speech tagging, named entity recognition, syntactic parsing, coreference resolution, and semantic role labeling, among others [1,2]. These were crafted largely as modular components in processing pipelines, each consuming linguistically annotated data and producing structured representations of language.
The arrival of the transformer architecture [3] and its subsequent scaling into billion-parameter models such as GPT [4], BERT [5], and T5 [6] did not render classical NLP obsolete; rather, it radically changed its role. Pre-trained LLMs implicitly encode rich linguistic knowledge that emerges from self-supervised objectives over massive corpora, capturing morphological, syntactic, and semantic regularities without explicit annotation [7]. Yet the theoretical frameworks of NLP remain indispensable for understanding, evaluating, and auditing LLM behavior — and for building systems in data-scarce settings where brute-force scaling is not a viable strategy.
This preprint is organized as follows. Section 2 revisits foundational NLP components and traces their integration into LLM architectures. Section 3 examines the multilingual challenge, focusing on low-resource settings. Section 4 surveys state-of-the-art approaches and benchmarks. Section 5 presents targeted case studies. Section 6 discusses open problems and future directions.

2. NLP Foundations Inside LLM Architectures

2.1. Tokenization and Subword Modeling

Tokenization is the first — and arguably most consequential — NLP decision embedded in every LLM. Classical approaches relied on whitespace segmentation or hand-crafted rules; modern LLMs universally employ subword tokenization algorithms such as Byte-Pair Encoding (BPE) [8], WordPiece [9], and SentencePiece [10]. These methods balance vocabulary coverage against sequence length, enabling models to handle morphologically rich languages with limited out-of-vocabulary exposure.
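To make the mechanics concrete, the following minimal sketch implements frequency-based BPE merging in the style of Sennrich et al. [8]; the toy corpus and the number of merges are illustrative and are not drawn from any cited implementation.

    # Minimal sketch of frequency-based BPE merging (after Sennrich et al. [8]).
    # The toy corpus and number of merges are illustrative only.
    import re
    from collections import Counter

    def pair_counts(vocab):
        """Count adjacent symbol pairs, weighted by word frequency."""
        pairs = Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for left, right in zip(symbols, symbols[1:]):
                pairs[(left, right)] += freq
        return pairs

    def merge_pair(pair, vocab):
        """Replace every occurrence of the chosen symbol pair with its concatenation."""
        pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
        return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

    # Words are represented as space-separated characters with an end-of-word marker.
    vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
             "n e w e s t </w>": 6, "w i d e s t </w>": 3}

    for _ in range(10):  # real vocabularies use tens of thousands of merges
        pairs = pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        vocab = merge_pair(best, vocab)
        print("merged:", best)

Each iteration greedily merges the most frequent adjacent pair, so frequent (typically English) character sequences become single vocabulary items first, which is precisely why under-represented languages end up over-segmented.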
Nevertheless, tokenization remains an Achilles’ heel for linguistic diversity. Languages with complex morphology — such as Turkish, Finnish, or Tagalog — are systematically over-segmented by vocabularies optimized for English, yielding longer token sequences, higher inference cost, and diminished representational fidelity [11]. Recent work by Rust et al. [12] demonstrates that vocabulary mismatch is a primary driver of performance degradation in multilingual BERT variants for low-resource languages.
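A simple way to observe this effect is to measure tokenizer "fertility", the average number of subword tokens per whitespace-delimited word. The hedged sketch below compares fertility across languages under a shared multilingual vocabulary; the model identifier and sample sentences are illustrative placeholders rather than an evaluation protocol from the cited work.

    # Hedged sketch: comparing subword fertility across languages with a shared
    # multilingual vocabulary. Model name and example sentences are illustrative.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

    samples = {
        "English": "The children are playing in the garden.",
        "Tagalog": "Naglalaro ang mga bata sa hardin.",
        "Finnish": "Lapset leikkivät puutarhassa.",
    }

    for language, sentence in samples.items():
        words = sentence.split()
        subwords = tokenizer.tokenize(sentence)
        fertility = len(subwords) / len(words)
        print(f"{language}: {len(subwords)} subwords / {len(words)} words = {fertility:.2f}")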

2.2. Syntactic and Semantic Representations

Probing studies have repeatedly demonstrated that transformer hidden states encode syntactic information, including dependency and constituency structure, despite receiving no explicit syntactic supervision [13,14]. Tenney et al. [14] showed that lower layers of BERT capture POS and chunking information while higher layers encode semantic role labels and coreference, mirroring the classical NLP pipeline in depth.
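The probing methodology itself is straightforward to sketch. The following is a minimal, hedged layer-wise probe in the spirit of [13,14]: hidden states are extracted from every BERT layer and a linear classifier is fit per layer on toy part-of-speech labels. The model name, sentences, and tags are illustrative placeholders, not the cited experimental setups.

    # Hedged sketch of layer-wise linear probing [13,14]: fit one linear POS probe per
    # transformer layer. Model, sentences, and tags are toy placeholders.
    import torch
    from sklearn.linear_model import LogisticRegression
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModel.from_pretrained("bert-base-cased", output_hidden_states=True)
    model.eval()

    sentences = [["The", "cat", "sat"], ["Dogs", "bark", "loudly"], ["She", "reads", "books"]]
    pos_tags  = [["DET", "NOUN", "VERB"], ["NOUN", "VERB", "ADV"], ["PRON", "VERB", "NOUN"]]

    def word_vectors(words):
        """Return one hidden-state vector per word (first subword) for every layer."""
        enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
        with torch.no_grad():
            layers = model(**enc).hidden_states  # (embeddings, layer 1, ..., layer 12)
        first_sub = {}
        for position, word_id in enumerate(enc.word_ids()):
            if word_id is not None and word_id not in first_sub:
                first_sub[word_id] = position
        return [[layer[0, first_sub[w]].numpy() for w in sorted(first_sub)] for layer in layers]

    num_layers = model.config.num_hidden_layers + 1
    features = [[] for _ in range(num_layers)]
    labels = []
    for words, tags in zip(sentences, pos_tags):
        vecs = word_vectors(words)
        for layer_index in range(num_layers):
            features[layer_index].extend(vecs[layer_index])
        labels.extend(tags)

    for layer_index in range(num_layers):
        probe = LogisticRegression(max_iter=1000).fit(features[layer_index], labels)
        print(f"layer {layer_index:2d}: train accuracy {probe.score(features[layer_index], labels):.2f}")

With a real annotated corpus and held-out evaluation, the per-layer accuracies trace where in the network a given linguistic property is most readily decodable.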
This emergent syntactic structure has practical implications: it suggests that LLMs can serve as general-purpose NLP backends, reducing the need for task-specific annotated datasets. However, this implicit encoding is fragile under distribution shift, performing poorly on non-standard dialects, code-switched text, and languages not represented in pre-training [15].

2.3. Discourse and Pragmatics

Beyond sentence-level processing, LLMs have absorbed capabilities traditionally associated with discourse NLP: coherence modeling, anaphora resolution, and pragmatic inference. The attention mechanism, with its capacity to relate arbitrary token pairs across long contexts [3], provides a structural basis for discourse-level reasoning. Long-context models such as Longformer [16] and extended context variants have further pushed these boundaries, enabling document-level summarization and multi-turn dialogue management.
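As a hedged illustration of long-context encoding, the sketch below runs a synthetic document through Longformer [16], which combines sliding-window local attention with a small number of globally attending tokens; the checkpoint name and document are illustrative placeholders.

    # Hedged sketch: encoding a long document with Longformer's local + global attention [16].
    # The checkpoint name and synthetic document are illustrative.
    import torch
    from transformers import LongformerModel, LongformerTokenizer

    tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
    model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

    document = " ".join(f"This is sentence number {i} of a long document." for i in range(300))
    inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=4096)

    # Most tokens attend only within a sliding window; here only the first token
    # attends globally, acting as a document-level summary position.
    global_attention_mask = torch.zeros_like(inputs["input_ids"])
    global_attention_mask[:, 0] = 1

    with torch.no_grad():
        outputs = model(**inputs, global_attention_mask=global_attention_mask)
    print(outputs.last_hidden_state.shape)  # (batch, sequence length, hidden size)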

3. The Low-Resource Language Challenge

3.1. The Data Imbalance Problem

Of the approximately 7,000 living languages documented globally, fewer than 100 have meaningful representation in the corpora used to train contemporary LLMs [17]. The Common Crawl, which underlies much of modern pre-training data, is estimated to be approximately 46% English, with the remaining share distributed heavily toward high-resource European languages [18]. Languages such as Tagalog, Yoruba, Swahili, Amharic, and dozens of others have token counts orders of magnitude smaller, producing severely undertrained representations.
This disparity has practical consequences: LLMs deployed in multilingual settings exhibit substantial performance gaps between high- and low-resource languages on benchmarks including XNLI [19], TyDiQA [20], and XTREME [21]. These gaps are not merely quantitative; they reflect genuine failures of understanding in morphology, pragmatics, and cultural knowledge that scale alone cannot easily remedy.

3.2. Cross-Lingual Transfer and Multilingual Pre-Training

Multilingual pre-trained models such as mBERT [5], XLM-R [22], and mT5 [23] represent the dominant strategy for extending NLP capabilities to low-resource languages through shared multilingual representations. XLM-R, trained on 100 languages using more than two terabytes of filtered CommonCrawl data, demonstrated that sufficiently large multilingual models achieve competitive performance even on languages with limited training data, via cross-lingual transfer [22].
Zero-shot and few-shot cross-lingual transfer — the ability to fine-tune on high-resource language data and generalize to low-resource targets — has emerged as the primary practical strategy for NLP development in resource-scarce contexts [24]. Instruction-tuned LLMs such as mT0 [25] and BLOOMZ [25] further extend this paradigm, demonstrating that multilingual instruction following can be achieved with surprisingly few target-language examples.
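A minimal sketch of the zero-shot recipe follows, assuming the XNLI language configurations are available on the Hugging Face Hub under the identifiers used below; the base model, subset sizes, and hyperparameters are illustrative choices. The model is fine-tuned on English premise/hypothesis pairs and then evaluated directly on Swahili without any Swahili supervision.

    # Hedged sketch of zero-shot cross-lingual transfer [24]: fine-tune a multilingual
    # encoder on English XNLI and evaluate on Swahili with no Swahili training data.
    # Dataset/model identifiers, subset sizes, and hyperparameters are illustrative.
    import numpy as np
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    model_name = "xlm-roberta-base"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

    def encode(batch):
        return tokenizer(batch["premise"], batch["hypothesis"],
                         truncation=True, padding="max_length", max_length=128)

    train_en = load_dataset("xnli", "en", split="train[:2000]").map(encode, batched=True)
    test_sw = load_dataset("xnli", "sw", split="test").map(encode, batched=True)

    def accuracy(eval_pred):
        logits, labels = eval_pred
        return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

    args = TrainingArguments(output_dir="xnli-en-finetune", num_train_epochs=1,
                             per_device_train_batch_size=16, logging_steps=50)
    trainer = Trainer(model=model, args=args, train_dataset=train_en, compute_metrics=accuracy)
    trainer.train()

    # Zero-shot evaluation: the fine-tuned model has never seen labeled Swahili examples.
    print(trainer.evaluate(eval_dataset=test_sw))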

3.3. Data Augmentation and Synthetic Data Strategies

In the absence of large labeled corpora, data augmentation has proven essential for low-resource NLP. Techniques include back-translation [26], cross-lingual data projection, template-based generation, and LLM-assisted corpus synthesis. Back-translation, originally developed for neural machine translation, has been repurposed to generate pseudo-labeled training data for classification and named entity recognition tasks [26].
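As an illustration, the hedged sketch below round-trips labeled English text through French with off-the-shelf MarianMT models to produce paraphrased training variants; the pivot language and model identifiers are illustrative choices, and for a low-resource target one would substitute an appropriate translation pair.

    # Hedged sketch of back-translation for data augmentation [26]: translate to a pivot
    # language and back to obtain paraphrases of labeled examples. Model names and the
    # pivot language (French) are illustrative choices.
    from transformers import MarianMTModel, MarianTokenizer

    def load_pair(model_name):
        """Load a MarianMT tokenizer/model pair for one translation direction."""
        return MarianTokenizer.from_pretrained(model_name), MarianMTModel.from_pretrained(model_name)

    tok_fwd, mt_fwd = load_pair("Helsinki-NLP/opus-mt-en-fr")  # English -> French
    tok_bwd, mt_bwd = load_pair("Helsinki-NLP/opus-mt-fr-en")  # French -> English

    def translate(texts, tokenizer, model):
        batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**batch, max_length=128)
        return tokenizer.batch_decode(generated, skip_special_tokens=True)

    labeled = [("The service was excellent.", "positive"),
               ("I will never order from them again.", "negative")]

    pivot = translate([text for text, _ in labeled], tok_fwd, mt_fwd)
    paraphrases = translate(pivot, tok_bwd, mt_bwd)

    # The paraphrases inherit the original labels, expanding the training set.
    for (original, label), paraphrase in zip(labeled, paraphrases):
        print(f"[{label}] {original}  ->  {paraphrase}")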
More recently, LLMs themselves have been used as data generators for low-resource NLP, prompting high-resource-language models to produce training instances that are then projected or translated into target languages [27]. While promising, this approach risks propagating hallucinations and culturally inappropriate content, underscoring the need for human-in-the-loop validation, especially for safety-sensitive applications.

4. State-of-the-Art Approaches and Benchmarks

4.1. Instruction Tuning and RLHF for Multilingual Settings

Instruction tuning — fine-tuning LLMs on diverse task descriptions paired with target outputs — has become a standard technique for eliciting zero-shot generalization [28]. When applied in multilingual settings, instruction tuning has been shown to substantially improve performance on low-resource languages, even when the instruction dataset itself is predominantly English [25]. This cross-lingual transfer of instruction-following ability suggests that such capabilities generalize across languages more readily than raw language modeling performance.
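To make the data format concrete, the sketch below assembles multilingual instruction-response pairs into a single prompt template of the kind used for supervised instruction tuning; the template wording and the example instructions are illustrative and do not reproduce the exact format of any cited model.

    # Hedged sketch: formatting multilingual instruction-tuning examples into a single
    # prompt/target pair per instance. The template and examples are illustrative only.
    PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"

    examples = [
        {"instruction": "Translate the sentence to Swahili.",
         "input": "Good morning, how are you?",
         "response": "Habari za asubuhi, habari yako?"},
        {"instruction": "Ibigay ang pangunahing ideya ng talata.",  # Tagalog instruction
         "input": "Ang klima ng Pilipinas ay tropikal at maalinsangan.",
         "response": "Tropikal at maalinsangan ang klima ng Pilipinas."},
    ]

    def format_example(example):
        """Return (prompt, target) as used for supervised instruction fine-tuning."""
        prompt = PROMPT_TEMPLATE.format(instruction=example["instruction"], input=example["input"])
        return prompt, example["response"]

    for ex in examples:
        prompt, target = format_example(ex)
        print(prompt + target + "\n" + "-" * 40)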
Reinforcement Learning from Human Feedback (RLHF) [29], while predominantly studied in English, presents both opportunity and risk for low-resource languages. Human preference data is expensive to collect for every language, and crowd-sourced preferences may not reflect culturally specific notions of helpfulness or harmlessness. Approaches such as preference data projection and culturally-aware RLHF are active areas of inquiry [30].

4.2. Benchmark Landscape

Evaluating multilingual NLP progress requires benchmarks that go beyond high-resource languages. Key benchmarks include XNLI (cross-lingual natural language inference, 15 languages) [19], TyDiQA (typologically diverse question answering, 11 languages) [20], XTREME and XTREME-R (cross-lingual transfer, 40+ languages) [21], and AfriSenti [31] for African language sentiment analysis. These benchmarks reveal consistent performance hierarchies strongly correlated with training data volume, emphasizing the need for targeted low-resource interventions beyond simply scaling general-purpose models.

5. Case Studies in Low-Resource NLP Toolkits

5.1. CalamanCy: A Tagalog NLP Toolkit

Tagalog, spoken by over 90 million people as a first or second language in the Philippines, has historically been severely underrepresented in NLP research. CalamanCy [32,35] is a spaCy-based NLP toolkit developed to address this gap, providing production-ready pipelines for Tagalog including tokenization, POS tagging, morphological analysis, dependency parsing, and named entity recognition. The toolkit leverages transformer backbones pre-trained on Filipino web corpora and demonstrates that focused, community-led NLP development can produce high-quality tools for low-resource languages outside the standard English-centric ecosystem.
A subsequent implementation and evaluation presented at the 2025 IEEE International Conference on Industrial Technology & Computer Engineering (ICITCE) [35] further validated CalamanCy’s practical utility in industrial and academic settings, confirming its robustness across diverse Tagalog text domains. CalamanCy’s design philosophy emphasizes interoperability with existing NLP infrastructure, enabling downstream integration with modern LLM pipelines. The toolkit highlights both the potential and the effort required for community-driven low-resource NLP: achieving competitive performance requires careful curation of training data, attention to morphological complexity, and sustained community engagement [32,35].
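A short usage sketch follows, assuming the toolkit is installed from PyPI and exposes a spaCy-style load helper as described in its documentation; the exact package and model names should be checked against the current CalamanCy release.

    # Hedged sketch of using CalamanCy's spaCy-style pipeline for Tagalog [32].
    # The installation command and model identifier below are assumptions; consult
    # the CalamanCy documentation for the current release names.
    #   pip install calamancy
    import calamancy

    nlp = calamancy.load("tl_calamancy_md-0.1.0")  # assumed model name

    doc = nlp("Pumunta si Maria sa Maynila upang bumili ng aklat.")

    # Token-level annotations: POS tags and dependency relations.
    for token in doc:
        print(token.text, token.pos_, token.dep_, sep="\t")

    # Named entities recognized by the pipeline.
    for entity in doc.ents:
        print(entity.text, entity.label_)

Because the pipeline is a standard spaCy object, its output can feed directly into downstream components or be used to pre-process Tagalog text before it reaches an LLM.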

5.2. AfroNLP and African Language Models

The AfroNLP initiative and associated models — including AfriBERTa [33] and Afro-XLMR — represent coordinated efforts to build NLP infrastructure for African languages at scale. Africa hosts over 2,000 languages, the vast majority of which have no digital NLP resources whatsoever. AfriBERTa, trained on 11 African languages using a compact corpus aggregated from Common Crawl and curated sources, demonstrates that even modest pre-training can yield meaningful NLP capabilities when combined with careful multilingual fine-tuning [33]. These efforts underscore the importance of community-centered data collection and the limitations of relying solely on web-scraped corpora for languages with limited digital footprints.
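As a small, hedged sketch of reusing such a model, the snippet below initializes an AfriBERTa checkpoint for sequence classification fine-tuning; the Hub identifier and label count are assumptions and should be checked against the official AfriBERTa release.

    # Hedged sketch: initializing an AfriBERTa checkpoint [33] for a downstream
    # classification task. The Hub identifier and label count are assumptions; check
    # the official AfriBERTa release for current names.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model_name = "castorini/afriberta_base"  # assumed Hugging Face Hub identifier
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

    # Example: encode a Swahili sentence for a 3-way sentiment task (labels illustrative).
    inputs = tokenizer("Huduma ilikuwa nzuri sana.", return_tensors="pt")
    logits = model(**inputs).logits
    print(logits.shape)  # (1, 3)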

5.3. Lessons Across Case Studies

Across these case studies, several common themes emerge: (1) community and domain expertise is irreplaceable in corpus curation for low-resource languages; (2) subword tokenization must be adapted to the morphological typology of the target language; (3) cross-lingual transfer from high-resource relatives provides a strong baseline but cannot substitute for native-language pre-training data; and (4) evaluation benchmarks must reflect the specific linguistic properties and use cases of the target community rather than simply translating English benchmarks.

6. Open Problems and Future Directions

Despite significant progress, fundamental challenges remain at the intersection of NLP and LLMs, particularly for linguistically diverse settings. We identify five priority research directions:
Linguistically-Informed Tokenization: Next-generation tokenizers should incorporate morphological analysis to better serve agglutinative and polysynthetic languages, moving beyond frequency-based BPE toward linguistically motivated segmentation.
Efficient Low-Resource Adaptation: Parameter-efficient fine-tuning methods — including LoRA [34], prefix tuning, and adapter layers — offer promising paths for adapting large multilingual models to low-resource languages with minimal data and computation (a configuration sketch appears after this list).
Culturally-Aware Alignment: RLHF and instruction tuning methodologies must be adapted to account for cultural variation in language use, pragmatics, and user expectations, requiring diverse human feedback collection beyond English-speaking populations.
Evaluation Ecosystem: There is an urgent need for evaluation benchmarks that cover a substantially broader set of languages, particularly for underrepresented language families in sub-Saharan Africa, the Pacific, and indigenous communities globally.
Interpretability and Linguistic Grounding: As LLMs become de facto NLP engines, interpretability research must maintain rigorous connection to linguistic theory, enabling diagnosis and correction of model failures rooted in genuine linguistic misunderstanding rather than superficial pattern matching.
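As a minimal sketch of the parameter-efficient adaptation direction above, the following configures LoRA adapters [34] on a multilingual encoder using the peft library; the target modules, rank, and base model are illustrative choices rather than recommended settings.

    # Hedged sketch: wrapping a multilingual encoder with LoRA adapters [34] via the
    # peft library. Rank, target modules, and the base model are illustrative choices.
    from peft import LoraConfig, TaskType, get_peft_model
    from transformers import AutoModelForSequenceClassification

    base = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=3)

    lora_config = LoraConfig(
        task_type=TaskType.SEQ_CLS,         # keep the classification head trainable
        r=8,                                # low-rank update dimension
        lora_alpha=16,
        lora_dropout=0.05,
        target_modules=["query", "value"],  # attention projections in RoBERTa-style models
    )

    model = get_peft_model(base, lora_config)
    model.print_trainable_parameters()  # only a small fraction of weights are trainable

Only the low-rank adapter matrices (and the task head) receive gradient updates, so a single frozen multilingual backbone can be adapted to many low-resource languages at a fraction of the usual storage and compute cost.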

7. Conclusion

This paper has examined the deep and evolving relationship between Natural Language Processing and Large Language Models. LLMs have absorbed and extended core NLP capabilities — from tokenization to discourse modeling — through scale and self-supervision, while classical NLP frameworks remain essential for evaluation, linguistic grounding, and low-resource development. The challenge of linguistic inclusivity is both an ethical imperative and a scientific frontier: the billions of speakers of low-resource languages deserve NLP tools of comparable quality to those available in English, and building such tools will require novel approaches to data, modeling, and community engagement. We hope this survey serves as a useful reference for researchers and practitioners working to make language technology genuinely universal.
Submitted as a preprint. This work has not undergone peer review. Comments welcome.

References

  1. Manning, C. D., & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. MIT Press.
  2. Jurafsky, D., & Martin, J. H. (2023). Speech and Language Processing (3rd ed.). Draft. https://web.stanford.edu/~jurafsky/slp3/.
  3. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
  4. Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., et al. (2020). Language models are few-shot learners. NeurIPS, 33, 1877-1901.
  5. Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. Proc. NAACL-HLT, 4171-4186.
  6. Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., & Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. JMLR, 21(140), 1-67.
  7. Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A primer in BERTology: What we know about how BERT works. Transactions of the ACL, 8, 842-866.
  8. Sennrich, R., Haddow, B., & Birch, A. (2016). Neural machine translation of rare words with subword units. Proc. ACL, 1715-1725.
  9. Wu, Y., Schuster, M., Chen, Z., Le, Q. V., et al. (2016). Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv:1609.08144.
  10. Kudo, T., & Richardson, J. (2018). SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. Proc. EMNLP, 66-71.
  11. Ács, J. (2019). Exploring BERT’s vocabulary. Budapest University of Technology and Economics Blog.
  12. Rust, P., Pfeiffer, J., Vulic, I., Ruder, S., & Gurevych, I. (2021). How good is your tokenizer? On the monolingual performance of multilingual language models. Proc. ACL-IJCNLP, 3118-3135.
  13. Hewitt, J., & Manning, C. D. (2019). A structural probe for finding syntax in word representations. Proc. NAACL-HLT, 4129-4138.
  14. Tenney, I., Das, D., & Pavlick, E. (2019). BERT rediscovers the classical NLP pipeline. Proc. ACL, 4593-4601.
  15. Blasi, D. E., Anastasopoulos, A., & Neubig, G. (2022). Systematic inequalities in language technology performance across the world’s languages. Proc. ACL, 1407-1423.
  16. Beltagy, I., Peters, M. E., & Cohan, A. (2020). Longformer: The long-document transformer. arXiv:2004.05150.
  17. Joshi, P., Santy, S., Budhiraja, A., Bali, K., & Choudhury, M. (2020). The state and fate of linguistic diversity and inclusion in the NLP world. Proc. ACL, 6282-6293.
  18. Kreutzer, J., Caswell, I., Wang, L., Wahab, A., van Esch, D., et al. (2022). Quality at a glance: An audit of web-crawled multilingual datasets. Transactions of the ACL, 10, 50-72.
  19. Conneau, A., Rinott, R., Lample, G., Williams, A., Bowman, S., Schwenk, H., & Stoyanov, V. (2018). XNLI: Evaluating cross-lingual sentence representations. Proc. EMNLP, 2475-2485.
  20. Clark, J. H., Choi, E., Collins, M., Garrette, D., Kwiatkowski, T., Nikolaev, V., & Palomaki, J. (2020). TyDi QA: A benchmark for information-seeking question answering in typologically diverse languages. Transactions of the ACL, 8, 454-470.
  21. Hu, J., Ruder, S., Siddhant, A., Neubig, G., Firat, O., & Johnson, M. (2020). XTREME: A massively multilingual multi-task benchmark for evaluating cross-lingual generalization. Proc. ICML, 4411-4421.
  22. Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., et al. (2020). Unsupervised cross-lingual representation learning at scale. Proc. ACL, 8440-8451.
  23. Xue, L., Constant, N., Roberts, A., Kale, M., Al-Rfou, R., Siddhant, A., Barua, A., & Raffel, C. (2021). mT5: A massively multilingual pre-trained text-to-text transformer. Proc. NAACL-HLT, 483-498.
  24. Lauscher, A., Ravishankar, V., Vulic, I., & Glavas, G. (2020). From zero to hero: On the limitations of zero-shot language transfer with multilingual transformers. Proc. EMNLP, 4483-4499.
  25. Muennighoff, N., Wang, T., Sutawika, L., Roberts, A., Biderman, S., et al. (2023). Crosslingual generalization through multitask finetuning. Proc. ACL, 15991-16111.
  26. Sennrich, R., Haddow, B., & Birch, A. (2016). Improving neural machine translation models with monolingual data. Proc. ACL, 86-96.
  27. Moller, A. G., Dalsgaard, J. A., Pera, A., & Aiello, L. M. (2023). Is a prompt and a few samples all you need? Using GPT-4 for data augmentation in low-resource classification tasks. arXiv:2304.13861.
  28. Wei, J., Bosma, M., Zhao, V. Y., Guu, K., Yu, A. W., Lester, B., Du, N., Dai, A. M., & Le, Q. V. (2022). Finetuned language models are zero-shot learners. Proc. ICLR.
  29. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., et al. (2022). Training language models to follow instructions with human feedback. NeurIPS, 35.
  30. Santurkar, S., Durmus, E., Ladd, F., Lee, C., Ganguli, D., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proc. ICML, PMLR 202.
  31. Muhammad, S. H., Yimam, S. M., Ahmad, I. S., et al. (2023). AfriSenti: A Twitter sentiment analysis benchmark for African languages. Proc. EMNLP, 13968-13981.
  32. Miranda, L. J. (2023). CalamanCy: A Tagalog natural language processing toolkit based on spaCy. Proc. 3rd Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP), ACL, 1-7.
  33. Ogueji, K., Zhu, Y., & Lin, J. (2021). Small data? No problem! Exploring the viability of pretrained multilingual language models for low-resourced languages. Proc. 1st Workshop on MRL, 116-126.
  34. Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2022). LoRA: Low-rank adaptation of large language models. Proc. ICLR.
  35. Parupally, V. (2025). CalamanCy: A Tagalog natural language processing toolkit. Proc. 2025 IEEE International Conference on Industrial Technology & Computer Engineering (ICITCE), Penang, Malaysia, 45-51.