AI-Enabled Generative Design of Immune Cells and Receptors for Programmable Immunity

Ola A Al Ewaidat; Moawiah M Naffaa

doi:10.20944/preprints202511.0986.v1

Submitted: 12 November 2025

Posted: 13 November 2025

Abstract

Recent advances in generative artificial intelligence have begun to redefine the practice of cell and immune engineering. By learning the statistical and structural grammar of biological systems, generative models can now design T-cell receptors, chimeric antigen receptors, and synthetic immune circuits that meet complex objectives of affinity, stability, and specificity. When integrated into automated design–build–test–learn pipelines, these models enable continuous cycles of hypothesis generation, experimental validation, and model refinement, creating a closed feedback loop between computation and biology. This review examines how AI-driven generative design is transforming immunoengineering across multiple scales, from molecular recognition to cellular phenotype and clinical translation. It discusses the foundational architectures that support generative modeling in biology, the emergence of adaptive biofoundries that link digital design to manufacturing, and the translational pathways through which programmable immune cells may enter clinical practice. The review also explores the ethical and regulatory dimensions of algorithmic biology, emphasizing the need for transparency, equitable access, and anticipatory governance. Together, these developments signal the rise of a new paradigm, programmable immunity, in which biological design, therapeutic discovery, and ethical responsibility evolve within a single, intelligent framework.

Keywords: generative immunoengineering; programmable immunity; cell therapy; T-cell receptor design; chimeric antigen receptor (CAR-T); synthetic biology; AI-driven biomanufacturing; design–build–test–learn (DBTL) loop; biofoundry automation; algorithmic ethics; translational immunology; adaptive regulation

Subject: Medicine and Pharmacology - Neuroscience and Neurology

1. Introduction

The immune system represents one of the most intricate adaptive architectures in biology, capable of sensing, learning, and remembering through dynamic molecular and cellular computation. Each T-cell receptor, B-cell receptor, and antibody embodies a combinatorial experiment in recognition, shaped by stochastic recombination and refined by selection [1]. This distributed intelligence endows immunity with extraordinary specificity and plasticity, but it also renders therapeutic design immensely complex. Efforts to reprogram immune function, whether by engineering monoclonal antibodies, constructing chimeric antigen receptors (CARs), or modulating regulatory cell lineages, have historically depended on rational and empirical strategies [2]. Such methods rely on human intuition, incremental mutagenesis, and extensive screening to optimize binding, stability, and signaling properties. Despite notable clinical success, this heuristic paradigm is inherently constrained. The potential design space of immune receptors and cell states is vast, multidimensional, and only sparsely sampled by experimentation [3,4].

In recent years, the emergence of generative artificial intelligence has introduced a qualitatively new mode of biological design. Unlike traditional predictive algorithms that classify or score existing data, generative models learn the underlying probability distributions of biological sequences, structures, and phenotypes, enabling them to synthesize novel entities consistent with the statistical grammar of life [5,6]. The same class of transformer and diffusion architectures that transformed natural-language and image generation has been adapted to protein science, where models such as ESM-2, ProteinMPNN, and RFdiffusion capture contextual and geometric dependencies across millions of sequences [7,8]. These advances allow algorithms to infer latent representations that link sequence to structure and structure to function, providing a foundation for de novo design rather than retrospective optimization.

When applied to immunology, such models give rise to what can be described as generative immunoengineering, the computational creation of immune molecules and cellular programs guided by learned representations of biological principles. Early demonstrations illustrate this emerging capability. PhysicoGPTCR integrates large-language modeling with physicochemical conditioning to generate T-cell receptor (TCR) sequences with specified antigen context [9]. ProteinMPNN-TCR further couples sequence generation with structural inference to propose antigen-specific receptors that remain within realistic conformational manifolds [10]. Beyond receptor design, multimodal generative frameworks are beginning to link transcriptomic, proteomic, and signaling data to the prediction and ultimately the design of cellular phenotypes. These developments suggest that immune cells themselves may soon be programmable entities whose functional repertoires can be expanded algorithmically [11,12].

This transition from rational design to generative creation alters not only the technical workflow but also the epistemological foundations of cell engineering. The conventional design–build–test–learn (DBTL) cycle, a foundational framework in synthetic biology that iteratively connects computational design, experimental construction, performance testing, and data-driven learning, has been limited by experimental throughput. It is now being supplanted by a form of closed-loop intelligence, in which generative models and automated experimentation operate within adaptive DBTL frameworks under human oversight. In this architecture, model-generated hypotheses iteratively guide empirical validation, while experimental outcomes continuously refine model priors, forming a self-improving computational–biological feedback system [13]. The laboratory thereby becomes an adaptive interface between computation and biology. Such integration has already accelerated antibody discovery, improved TCR–peptide–HLA binding prediction, and inspired automated screening pipelines driven by active learning [14]. In the longer term, the coupling between in silico generation and in vitro verification may lead to self-improving biomanufacturing ecosystems capable of producing tailored immune therapies with unprecedented speed and precision [15].

The opportunities of this paradigm are accompanied by significant challenges. The immune repertoire data that underpin generative training remain incomplete and biased toward particular species, diseases, and sequencing modalities, which raises concerns about generalizability and off-target risk [16]. The interpretability of high-capacity models is limited, and the regulatory frameworks governing AI-designed biologics are still emerging. Furthermore, the capacity to generate vast numbers of synthetic receptor sequences demands robust ethical and biosafety oversight to prevent unintended immunogenicity or dual-use misuse [17]. Addressing these issues will require collaborative standards that integrate computational transparency, experimental reproducibility, and normative guidance from clinical immunology and bioethics.

This review examines the convergence of artificial intelligence and immune cell engineering, focusing on how generative algorithms are reshaping the design landscape of receptors, signaling modules, and cellular behaviors. We synthesize conceptual foundations, survey recent technological advances, and outline the translational and governance challenges that accompany this accelerating field. The goal is to articulate a coherent framework for AI-enabled generative design of immune cells and receptors for programmable immunity, a paradigm that transforms the immune system from a biological phenomenon into a programmable platform in which data, learning, and design converge to expand the grammar of therapeutic possibility (Figure 1). In this context, programmable immunity denotes the AI-enabled capacity to computationally design and regulate immune functions with defined precision, spanning receptor–antigen recognition, intracellular signaling, and cellular state transitions. It envisions an intelligent interface in which generative algorithms, molecular data, and synthetic biology converge to engineer immune behaviors in silico and validate them experimentally. Conceptually, programmable immunity reframes the immune system as a reconfigurable information network—continuously learnable, optimizable, and expressible through the generative grammar of biology.

2. The Generative Turn in Immunoengineering

For much of modern biotechnology, the design of immune therapeutics has followed a rational–empirical model rooted in target identification, scaffold selection, and iterative optimization. Breakthroughs such as monoclonal antibodies and the first generations of CAR-T cells emerged from this approach [18]. Yet the dependence on sequential mutagenesis and experimental screening imposes severe constraints on both scale and efficiency. The theoretical diversity of immune receptor sequences exceeds experimental capacity by many orders of magnitude [19], making comprehensive exploration of the antigen–receptor landscape unattainable and creating a persistent bottleneck in immune engineering [20].

The introduction of generative artificial intelligence has begun to transform this landscape. Unlike discriminative algorithms that categorize existing data, generative models learn the statistical patterns that define sequence–structure–function relationships and can therefore synthesize novel candidates consistent with those learned principles [21]. In this view, the immune system becomes not merely an object of analysis but a source of linguistic and structural priors from which algorithms infer the grammar of recognition [22]. Trained on large immune-repertoire datasets and receptor–antigen complexes, these models can extrapolate to regions of sequence space that have not been sampled by natural evolution yet remain statistically and biophysically coherent. The result is a shift from selective discovery to probabilistic creation [23].

Advances in protein foundation models provide the computational architecture for this transformation. Transformer-based language models such as ESM-2, ProtT5, and MSA Transformer capture contextual dependencies across millions of sequences [24]. Diffusion and graph-neural architectures such as ProteinMPNN, RFdiffusion, and Chroma encode geometric constraints that preserve structural integrity [25]. When these models are fine-tuned on immunoglobulin, TCR, or antibody datasets, they learn both sequence syntax and physicochemical rules of folding and binding [26]. Conditional generation techniques allow the integration of antigenic or peptide–HLA information so that sequence generation becomes target-aware design with affinity, specificity, and stability as explicit optimization goals [27].

Within immunoengineering, this generative capability is reshaping multiple domains. At the receptor level, models such as PhysicoGPTCR and ProteinMPNN-TCR co-model sequence, structure, and epitope context to generate receptor variants optimized for stability and antigen recognition [9,28]. At the construct level, emerging frameworks extend these generative principles to the modular design of chimeric receptors integrating variable fragments, hinge regions, and intracellular signaling domains to tune activation strength, expression, and safety [29]. Beyond receptors, multimodal generative models that integrate transcriptomic and proteomic profiles can infer regulatory or metabolic configurations that stabilize desirable cellular states. This emerging capability defines a new phase of AI-assisted cellular design [30].

The broader consequence of this transition lies in the restructuring of experimental reasoning. Traditional discovery pipelines rely on experimental data to generate hypotheses. In the generative framework, models propose candidates that guide experiments, and experimental outcomes continually refine model priors, creating a self-reinforcing design–build–test–learn cycle [15]. Laboratories are evolving into adaptive systems in which computational and biological processes operate in tandem, accelerating the translation of insight into function [31]. This feedback architecture mirrors closed-loop control in engineering and signals the rise of self-optimizing biomanufacturing platforms.

Despite these advances, the generative turn requires careful evaluation. Models trained on incomplete or biased repertoires may produce sequences that violate structural or safety constraints [32]. Ensuring safety, interpretability, and reproducibility in AI-generated biologics demands comprehensive benchmarking, transparent documentation, and the establishment of regulatory frameworks suited to algorithmic design [33]. Generative systems should therefore be viewed as collaborators rather than replacements for scientific expertise, augmenting experimental insight rather than automating it.

Viewed through this lens, generative immunoengineering is both a technological and conceptual redefinition. Computation no longer merely represents biological systems; it participates in their creation. The ability to generate plausible, functional immune receptors ab initio transforms design from an empirical craft into an algorithmic discipline and expands the creative frontier of synthetic biology [34]. The subsequent sections examine the architecture, data resources, and experimental integrations that together define this emerging paradigm: a convergence of artificial intelligence and immunological engineering directed toward programmable immunity.

3. Foundations of Generative Biology for Immune Systems

The application of generative artificial intelligence to immunoengineering rests upon foundational advances in computational biology that have redefined how protein sequences, structures, and functions are represented. Over the past five years, models originally designed for natural language processing have been adapted to the protein domain, where amino acid sequences are treated as biological sentences governed by an implicit grammar of evolution [35]. This conceptual analogy between linguistic and molecular syntax has allowed transformer architectures, recurrent neural networks, and diffusion-based frameworks to learn the contextual dependencies that link sequence motifs to structural and functional outcomes. These models now provide the representational backbone for generative biology, enabling the creation of new molecular entities that adhere to the learned statistical rules of natural proteins [36].

At the core of this transformation lies the protein language model (pLM). Early models such as UniRep and ProtBERT demonstrated that contextual embeddings derived from millions of sequences could capture latent biophysical properties including secondary structure propensity, binding-site probability, and thermostability [37,38]. More recent architectures such as ESM-2 and MSA Transformer incorporate evolutionary and multiple-sequence alignment information, achieving state-of-the-art performance in structure prediction and mutational effect inference [39,40,41]. These embeddings serve as a universal representation that can be fine-tuned for diverse downstream tasks, including receptor generation, antigen classification, and affinity optimization. In the context of immune receptor design, such representations provide a high-dimensional landscape in which antigen specificity and receptor stability can be jointly optimized by generative sampling [42].

Complementary to language-based methods are structure-aware generative models that explicitly incorporate geometric and energetic constraints. Graph neural networks, energy-based models, and diffusion frameworks such as ProteinMPNN, RFdiffusion, and Chroma model the conditional probability of atomic arrangements given a target fold or binding interface [25,43,44]. By learning from experimentally determined structures and molecular dynamics simulations, these models capture the geometric invariants that define stable tertiary and quaternary conformations. When applied to immune complexes, structure-aware generators can propose receptor sequences that maintain structural fidelity while accommodating specific epitope geometries, a key requirement for accurate antigen engagement [15]. The ability to jointly model sequence and structure distinguishes these approaches from earlier heuristic design pipelines and provides a direct pathway from digital generation to physical synthesis.

Recent innovations have begun to integrate multimodal generative architectures that combine sequence, structure, and system-level data into unified frameworks. Variational autoencoders and diffusion models trained on multi-omic datasets can embed gene expression, protein abundance, and signaling dynamics within shared latent spaces [45,46]. These models enable not only receptor-level design but also the generation of synthetic cell states, offering predictions of how genetic or metabolic perturbations may influence functional phenotypes. When aligned with single-cell RNA sequencing, proteomic profiling, and CRISPR perturbation data, multimodal models allow researchers to simulate the outcomes of cellular reprogramming before executing experimental interventions [47]. This integration forms the conceptual foundation of generative immunoengineering, where molecular design and cell-state control converge through shared representational learning.

Another essential component of this foundation is conditional generation, which introduces explicit control variables into the generative process. Conditioning can be achieved through structural templates, physicochemical features, antigenic context, or phenotype-level objectives [48]. In receptor engineering, conditional models can generate sequences that maximize predicted binding affinity to a specified epitope while minimizing cross-reactivity and immunogenicity [49,50]. In cell engineering, conditioning may involve the specification of transcriptional or metabolic profiles corresponding to desired states such as resistance to exhaustion, enhanced memory formation, or altered cytokine secretion [51,52]. By encoding these objectives into the generative process, models move beyond unsupervised creativity toward constrained biological design that aligns with therapeutic goals.

The reliability of generative models depends critically on the quality, diversity, and annotation of training data. Immune-repertoire datasets such as IEDB, VDJdb, OAS, and PIRD provide millions of receptor sequences, yet these datasets are biased toward particular species, disease contexts, and sequencing platforms [53]. Structural datasets remain comparatively sparse, limiting model generalization for certain receptor classes or antigen types. Integrating curated experimental datasets with synthetic augmentation strategies, including contrastive learning and adversarial perturbation, has emerged as a strategy to expand functional diversity while maintaining biological plausibility [54]. Standardization of data formats and metadata annotations is equally important for ensuring reproducibility across laboratories and for benchmarking model performance under transparent conditions [55].

The theoretical strength of these models is matched by their capacity for interpretability and embedding analysis, which enables a mechanistic understanding of learned representations. Techniques such as attention visualization, feature attribution, and latent-space interpolation have revealed that protein language models implicitly capture biochemical hierarchies reminiscent of evolutionary phylogenies [39,56]. In immune systems, these embeddings encode information about complementarity-determining region (CDR) composition, binding topology, and germline lineage relationships [57]. Understanding how models internalize these features not only enhances trust in generative predictions but also provides a new lens through which to examine the informational logic of the immune repertoire itself.

Together, these methodological pillars establish the computational grammar of generative biology. The convergence of language-based, structure-aware, and multimodal approaches provides the mathematical substrate on which immune receptor and cell-state generation can occur with controllable precision. As models become larger and more contextually integrated, they begin to approximate a universal generator of biomolecular function, one capable of producing sequences, folds, and regulatory motifs that are both novel and biologically viable. In the context of immunoengineering, this synthesis enables the design of receptors, signaling networks, and phenotypic programs guided not by human intuition alone but by statistical representations of evolution and function embedded within artificial intelligence [58,59].

The next section will examine how these generative foundations are being applied to the design of immune repertoires and antigen-specific recognition, focusing on model architectures, conditioning strategies, and validation frameworks that connect digital generation to experimental reality.

4. Learning Immune Specificity from Repertoires and Structures

The capacity of the adaptive immune system to distinguish self from non-self and to recognize an effectively unlimited range of antigens arises from the extraordinary diversity of its receptor repertoires. Each T-cell and B-cell receptor represents a molecular hypothesis drawn from the combinatorial space of V(D)J recombination, junctional variability, and somatic hypermutation [60]. Mapping this diversity and translating it into actionable design principles have long challenged immunology. Generative artificial intelligence now provides new tools for capturing the probabilistic architecture of immune specificity by learning directly from large-scale receptor and antigen datasets [61,62].

Immune-repertoire learning relies on sequence datasets such as IEDB, VDJdb, OAS, and PIRD, which together contain millions of annotated T-cell and B-cell receptor sequences linked to antigenic or disease contexts [63,64,65]. Deep representation models trained on these corpora can learn statistical signatures that define clonotype structure, CDR usage, and gene-segment pairing preferences [66]. Early models such as DeepTCR, TCR-BERT, and Immune2Vec demonstrated that unsupervised embeddings derived from raw sequence data capture functional and evolutionary relationships between receptors [4,67,68]. These embeddings have subsequently become the basis for generative modeling, enabling conditional sampling of sequences that preserve repertoire-level statistics while exploring unsampled regions of sequence space.

The extension of this approach to structure-informed modeling has further refined our understanding of immune recognition. High-resolution structural data from crystallography and cryo-electron microscopy, combined with molecular-dynamics simulations, provide explicit information about how complementarity-determining loops engage peptide–MHC complexes or conformational epitopes [69]. Graph neural networks and diffusion-based models such as ProteinMPNN-TCR, AlphaBind, and ImmuneDiffusion encode both sequential and geometric features, allowing generation or evaluation of receptor variants that maintain structural stability while optimizing epitope complementarity [49,70,71]. By unifying sequence and structural representations, these models can predict or design receptors that balance affinity, cross-reactivity, and biophysical feasibility.

Learning immune specificity also depends on understanding contextual conditioning, in which receptor generation is guided by features of the antigen or by the cellular environment in which binding occurs. Conditional generative frameworks incorporate peptide–HLA embeddings, physicochemical descriptors, and even transcriptomic signatures of the responding cell population [72]. This conditioning allows models to generate receptors tailored to particular epitopes or immunological niches rather than relying on global repertoire statistics. The resulting designs can be filtered by computational docking, binding-energy prediction, or molecular-dynamics relaxation to ensure structural plausibility and to screen for potential off-target interactions [73,74].

An equally important development is the use of contrastive and active-learning strategies that couple model training to experimental feedback. High-throughput binding assays, yeast or mammalian display systems, and single-cell sequencing technologies provide empirical data that refine model priors through iterative updates [75,76,77]. Active learning enables the model to identify regions of uncertainty and to propose new receptor sequences whose experimental testing would maximize information gain. This closed-loop process progressively enhances model fidelity and creates an adaptive design framework that mirrors the iterative nature of immune evolution itself.
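
The acquisition step of such an active-learning loop can be illustrated with a minimal sketch. It assumes ensemble disagreement as a simple proxy for expected information gain, which is only one of several possible acquisition criteria and is not attributed to any cited method. The names `make_toy_model`, `disagreement`, and `select_batch` are hypothetical, and the "models" are random per-residue scorers standing in for trained binding predictors.

```python
import random
import statistics

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_toy_model(seed):
    """Hypothetical stand-in for one trained ensemble member (random per-residue weights)."""
    rng = random.Random(seed)
    weights = {aa: rng.uniform(-1.0, 1.0) for aa in AMINO_ACIDS}

    def predict(sequence):
        # Mean residue weight; a real model would be fitted to binding data.
        return sum(weights[aa] for aa in sequence) / len(sequence)

    return predict


def disagreement(sequence, ensemble):
    """Population standard deviation of ensemble predictions, used as an uncertainty proxy."""
    return statistics.pstdev([model(sequence) for model in ensemble])


def select_batch(candidates, ensemble, batch_size):
    """Return the candidates on which the ensemble disagrees most."""
    ranked = sorted(candidates, key=lambda s: disagreement(s, ensemble), reverse=True)
    return ranked[:batch_size]


# Toy usage: 50 random 12-residue candidates, 5 ensemble members.
ensemble = [make_toy_model(seed) for seed in range(5)]
rng = random.Random(0)
candidates = ["".join(rng.choice(AMINO_ACIDS) for _ in range(12)) for _ in range(50)]
batch = select_batch(candidates, ensemble, batch_size=5)
print(batch)  # sequences that would be sent to the assay in the next round
```

In a real pipeline, the assay results for the selected batch would be added to the training set and the ensemble retrained before the next selection round.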

Despite these advances, challenges remain in ensuring that learned representations reflect biological causality rather than statistical correlation. Sequence redundancy, sampling bias, and limited negative examples can inflate apparent model accuracy while masking gaps in functional understanding [78]. The field is responding through the creation of standardized benchmarking platforms, curated cross-reactivity datasets, and transparent reporting of validation metrics that measure generalization across antigen classes and experimental systems [79,80]. Incorporating structural energetics, thermodynamic parameters, and molecular-simulation outputs into the training regime further grounds model predictions in biophysical reality [81,82].
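
One common way to limit the accuracy inflation caused by sequence redundancy is to split data by similarity cluster rather than by individual sequence. The sketch below is illustrative and rests on stated assumptions: position-wise identity on equal-length sequences stands in for a proper alignment-based measure, the 0.8 threshold is arbitrary, and the CDR3-like strings are made-up examples. Greedy clustering compares each sequence only to cluster representatives, so it reduces but does not strictly guarantee the absence of near-duplicates across the split.

```python
def identity(a, b):
    """Fraction of matching positions; sequences of unequal length count as 0."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(sequences, threshold=0.8):
    """Assign each sequence to the first cluster whose representative it resembles."""
    clusters = []  # each cluster is a list; element 0 is its representative
    for seq in sequences:
        for cluster in clusters:
            if identity(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


def split_by_cluster(clusters, test_fraction=0.2):
    """Hold out whole clusters, smallest first, while the test set is below the target size."""
    total = sum(len(c) for c in clusters)
    train, test = [], []
    for cluster in sorted(clusters, key=len):
        if len(test) < test_fraction * total:
            test.extend(cluster)  # may overshoot the target on small datasets
        else:
            train.extend(cluster)
    return train, test


# Made-up CDR3-like sequences: three near-duplicates, two near-duplicates, one singleton.
cdr3s = [
    "CASSLGQAYEQYF", "CASSLGQGYEQYF", "CASSLGQAYEQFF",
    "CSARDRTGNGYTF", "CSARDRTGNGYSF",
    "CAWSVGGNTEAFF",
]
clusters = greedy_cluster(cdr3s)
train, test = split_by_cluster(clusters)
print(len(clusters), len(train), len(test))
```

Because near-identical sequences stay on the same side of the split, held-out performance is a more honest estimate of generalization to unseen receptor families.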

Together, these developments illustrate a convergence between data-driven learning and structural immunology. By jointly modeling sequence, structure, and antigenic context, generative and representation models are beginning to reconstruct the rules that govern immune recognition. In practical terms, they offer the capacity to generate receptor repertoires that are both diverse and functionally directed, to predict the cross-reactivity landscape of candidate therapeutics, and to guide the rational expansion of immune libraries toward desired antigen spaces. The synthesis of repertoire-scale learning with structural modeling thus forms the methodological core of generative immunoengineering, providing the analytical framework through which specificity can be understood, predicted, and designed (Table 1).

5. AI-Driven Design of Immune Receptors and Constructs

The emergence of generative artificial intelligence has transformed the design of immune receptors from an empirical pursuit into a computational discipline. Models that learn sequence–structure–function relationships across vast biological corpora can now propose receptor variants and modular constructs that meet predefined design constraints for affinity, stability, signaling balance, and manufacturability [83]. This integration of algorithmic inference with molecular immunology redefines the creative boundary of immune engineering, transforming receptor discovery into a process of directed generation guided by learned biological priors [84].

5.1. Designing Antigen-Specific Receptors

The adaptive immune system recognizes antigens through an immense repertoire of receptor sequences that encode highly specific binding topologies. Generative models trained on large-scale T-cell receptor (TCR) and antibody repertoires can recapitulate and extend this natural diversity. Transformer-based language models such as PhysicoGPTCR, a physics-informed generative approach to TCR design, employ contextual embeddings that capture residue-level physicochemical properties while conditioning generation on peptide–HLA or epitope descriptors [9,85,86]. By sampling from latent spaces that integrate both sequence statistics and antigenic context, these models generate plausible receptor candidates that occupy unobserved yet biologically coherent regions of sequence space.

Structure-aware networks further refine this capacity. Frameworks such as ProteinMPNN-TCR, AlphaBind, and ImmuneDiffusion explicitly encode geometric and energetic constraints, allowing the generation of receptors that maintain structural stability and realistic interface complementarity [28,70,71,87]. Diffusion-based models trained on crystallographic complexes of TCR–pMHC or antibody–antigen interactions learn the conditional probability distribution of amino-acid arrangements within binding interfaces and can thus propose residues likely to enhance affinity without inducing steric conflict [88,89]. These approaches combine evolutionary information, structural priors, and physical constraints to produce receptor sequences that balance functional novelty with biophysical plausibility.

Antibody and nanobody design have similarly benefited from generative architectures. Protein language models fine-tuned on antibody repertoires capture canonical framework and CDR motifs while permitting targeted diversification of paratopes [90,91]. Diffusion networks have been used to generate entire variable domains consistent with specific antigenic epitopes identified by cryo-electron microscopy or deep mutational scanning [92,93]. Generative sampling across latent manifolds defined by affinity, solubility, and expression metrics enables the creation of variant ensembles optimized for multiple objectives simultaneously. The resulting computationally derived antibodies and TCR mimetics extend the natural immune toolkit toward synthetic precision molecules with programmable binding properties [71,94].
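
When several objectives such as affinity, solubility, and expression are optimized together, no single variant is usually best on all of them, so a typical post-generation step is to keep the non-dominated (Pareto-optimal) candidates. The sketch below illustrates that filtering step only. The `Candidate` fields and all scores are hypothetical predicted values on an arbitrary scale where higher is better, not data from any cited study.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    affinity: float    # hypothetical predicted score, higher is better
    solubility: float  # hypothetical predicted score, higher is better
    expression: float  # hypothetical predicted score, higher is better


def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    pairs = list(zip(a[1:], b[1:]))  # skip the name field
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def pareto_front(candidates):
    """Keep candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


variants = [
    Candidate("v1", 0.9, 0.4, 0.6),
    Candidate("v2", 0.7, 0.8, 0.5),
    Candidate("v3", 0.6, 0.7, 0.4),  # dominated by v2
    Candidate("v4", 0.9, 0.4, 0.7),  # dominates v1
]
front = pareto_front(variants)
print([c.name for c in front])  # → ['v2', 'v4']
```

The surviving variants represent different trade-offs (v2 favors solubility, v4 favors affinity), and choosing among them remains an experimental and clinical judgment.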

5.2. Modular Optimization of Chimeric Antigen Receptors

Chimeric antigen receptors (CARs) are synthetic constructs that rewire immune recognition into an engineered signaling cascade. Each CAR consists of distinct functional modules: an extracellular binding domain, a hinge and transmembrane region, and one or more intracellular signaling motifs. The performance of a CAR depends on the integrated behavior of these modules, yet empirical optimization through domain swapping and screening is slow and labor-intensive [95]. Generative AI has introduced data-driven strategies capable of exploring this modular design space systematically [96].
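
The modular structure described above can be made concrete with a small data-structure sketch that enumerates a combinatorial CAR design space. The `CarConstruct` class, the `enumerate_constructs` helper, and the short module lists are illustrative assumptions. The domain names are commonly used CAR building blocks, but the particular combinations are not taken from the cited studies.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class CarConstruct:
    binder: str         # extracellular binding domain (e.g. an scFv)
    hinge: str          # hinge / spacer region
    transmembrane: str  # transmembrane domain
    signaling: tuple    # ordered intracellular signaling motifs


def enumerate_constructs(binders, hinges, transmembranes, signaling_options):
    """Build every combination of the supplied modules."""
    return [
        CarConstruct(b, h, t, s)
        for b, h, t, s in product(binders, hinges, transmembranes, signaling_options)
    ]


# Illustrative module lists (assumed for this sketch).
binders = ["anti-CD19 scFv", "anti-BCMA scFv"]
hinges = ["CD8a hinge", "CD28 hinge", "IgG4 hinge"]
transmembranes = ["CD8a TM", "CD28 TM"]
signaling_options = [
    ("CD28", "CD3z"),
    ("4-1BB", "CD3z"),
    ("CD28", "4-1BB", "CD3z"),
]

library = enumerate_constructs(binders, hinges, transmembranes, signaling_options)
print(len(library))  # → 36 (2 x 3 x 2 x 3)
```

Even these short lists yield 36 constructs. Realistic module catalogs grow multiplicatively, which is why exhaustive screening becomes impractical and model-guided prioritization is attractive.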

Transformer-based models represent CAR components as compositional sequences that encode domain identity, positional order, and contextual interdependencies. By training on curated libraries of CAR constructs linked to phenotypic readouts, these architectures learn how variations in domain composition and arrangement modulate activation thresholds, cytokine signatures, and cellular persistence [97,98]. Conditional generation allows the creation of new CAR configurations optimized for desired functional signatures, such as reduced tonic signaling or enhanced metabolic fitness [99,100].

Reinforcement learning and active-learning algorithms further refine this design process. In these frameworks, model predictions are iteratively updated using experimental feedback from high-throughput CAR screening platforms, enabling convergence toward optimal constructs [101,102,103]. Such feedback loops have already produced CARs with modified hinge lengths and co-stimulatory domain combinations that yield improved cytotoxic performance and diminished exhaustion markers.

Generative modeling also supports the design of armored CARs, which incorporate payload modules such as cytokine secretion cassettes, chemokine receptors, or immune-checkpoint inhibitors. By embedding these additional modules within the same representational space, AI models can co-optimize receptor binding and paracrine modulation, resulting in constructs tailored for hostile tumor microenvironments [104,105]. Collectively, these developments illustrate how generative frameworks convert CAR engineering from heuristic assembly into a rational optimization problem solvable by machine learning.

5.3. Engineering Logic-Gated and Multiplexed Architectures

A further evolution of receptor design involves the encoding of logical operations into immune constructs. Logic-gated CARs and TCRs employ multi-antigen recognition to refine specificity and reduce off-target cytotoxicity [106]. Generative modeling enables systematic exploration of these multi-input architectures by representing antigens, linkers, and signaling modules within a shared latent space. Conditional diffusion or variational models can generate dual-specific binding domains whose cooperative interactions produce Boolean outcomes such as AND, OR, or NOT responses depending on antigen co-expression patterns [107].
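As a minimal sketch of the Boolean behavior described above, the following Python snippet treats each antigen as present or absent relative to a density threshold and evaluates AND, OR, and NOT-type gates. The antigen labels, the threshold, the cell profiles, and the function names are illustrative assumptions and are not taken from any published construct.

```python
from typing import Callable, Dict, Tuple

# Hypothetical gate table: each gate maps two antigen-presence flags to a decision.
GATES: Dict[str, Callable[[bool, bool], bool]] = {
    "AND": lambda a, b: a and b,          # both antigens required
    "OR": lambda a, b: a or b,            # either antigen suffices
    "A_NOT_B": lambda a, b: a and not b,  # antigen B acts as a veto signal
}

def gate_response(gate: str, density_a: float, density_b: float,
                  threshold: float = 1000.0) -> bool:
    """Return True if the gated cell is predicted to activate (toy model)."""
    a_present = density_a >= threshold
    b_present = density_b >= threshold
    return GATES[gate](a_present, b_present)

# Made-up antigen densities (molecules per cell) for three cell types.
profiles: Dict[str, Tuple[float, float]] = {
    "tumor": (5000.0, 4000.0),
    "healthy_A_only": (5000.0, 50.0),
    "healthy_B_only": (20.0, 4000.0),
}

for name, (da, db) in profiles.items():
    print(name, {g: gate_response(g, da, db) for g in GATES})
```

Under these toy numbers only the AND gate activates on the tumor profile while staying silent on both healthy profiles, which is the selectivity argument made for multi-antigen recognition.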

These models facilitate computational optimization of interdomain spacing, linker composition, and binding affinity ratios required for balanced activation. By simulating dose–response landscapes across predicted antigen concentrations, AI systems can identify configurations that achieve strong tumor selectivity while sparing healthy tissues [2,108]. In addition, machine-learning-guided sampling of co-stimulatory domain combinations enables fine-tuning of intracellular signaling strength and timing [109,110]. Such multi-objective optimization integrates molecular recognition with system-level control, extending the reach of generative immunoengineering beyond molecular design to programmable cellular logic.
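The dose-response reasoning above can be illustrated with a graded rather than Boolean model. The sketch below assumes Hill-type activation for each binding arm and a multiplicative AND coupling, then scans a small grid of half-maximal constants for the setting with the best tumor-to-healthy activation ratio. The Hill form, the multiplicative coupling, the antigen densities, and the grid values are all assumptions chosen for illustration.

```python
from itertools import product

def hill(x: float, k: float, n: float = 2.0) -> float:
    """Hill activation for antigen density x with half-maximal constant k."""
    return x**n / (k**n + x**n)

def and_gate(a: float, b: float, k_a: float, k_b: float) -> float:
    # Multiplicative coupling is one simple way to model a graded AND gate.
    return hill(a, k_a) * hill(b, k_b)

# Made-up antigen densities (molecules per cell).
tumor = (5000.0, 4000.0)
healthy = [(5000.0, 50.0), (20.0, 4000.0)]

def selectivity(k_a: float, k_b: float):
    """Return (on-target activation, ratio of on-target to worst off-target)."""
    on = and_gate(*tumor, k_a, k_b)
    off = max(and_gate(a, b, k_a, k_b) for a, b in healthy)
    return on, on / max(off, 1e-12)  # guard against division by zero

grid = [100.0, 500.0, 1000.0, 2000.0]
# Keep only settings that still activate strongly on the tumor profile.
feasible = [ks for ks in product(grid, grid) if selectivity(*ks)[0] >= 0.5]
best = max(feasible, key=lambda ks: selectivity(*ks)[1])
print("best (k_a, k_b):", best, "-> (on, ratio):", selectivity(*best))
```

The feasibility filter reflects the balance discussed in the text: a configuration is only useful if it spares healthy tissue without losing on-target activation.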

Multiplexed receptor architectures, which incorporate multiple signaling channels within a single cell, also benefit from generative approaches. Models trained on combinatorial libraries of bispecific or tandem CARs learn statistical mappings between module composition and functional synergy [111,112]. The ability to generate thousands of candidate architectures in silico and evaluate them through predictive scoring significantly accelerates discovery, particularly for solid-tumor targets that require simultaneous recognition of multiple antigens.

5.4. Integrating Computational Design with Experimental Validation

The practical utility of generative receptor design depends on its integration with empirical validation. High-throughput display technologies, including yeast, phage, and mammalian systems, provide experimental evidence that grounds model predictions [113]. Single-cell transcriptomic and proteomic profiling captures downstream functional outcomes and supplies data for retraining generative models through active learning [114,115].

These advances have given rise to adaptive DBTL frameworks, in which computational generation and laboratory experimentation are connected through iterative feedback. Within such closed-loop systems, generative models propose receptor or construct candidates, automated biofoundries synthesize and screen them, and the resulting empirical data are reintegrated to update model priors. This adaptive coupling between in silico inference and in vitro validation forms the operational core of generative immunoengineering.

Closed-loop pipelines are emerging in which model-generated receptor or construct libraries are synthesized, expressed, and screened automatically. The resulting activity and expression data are re-entered into the model to update its priors, progressively improving generative accuracy [116]. These DBTL-driven feedback cycles enable continuous hypothesis generation, experimental execution, and model refinement to occur as a coordinated, self-correcting process under human oversight. This iterative design–build–test–learn cycle parallels the self-optimization processes characteristic of control theory and enables rapid convergence toward functional solutions.

Molecular-dynamics simulations and energy-based filtering further ensure that generated sequences respect physical constraints [117]. Structural relaxation, solvent accessibility analysis, and free-energy estimation help eliminate unstable or non-functional designs before synthesis [118]. When combined with automated DNA assembly and cell-based assays, these computational safeguards reduce experimental cost and improve hit rates. Collectively, these elements establish a hybrid experimental–computational ecosystem in which the DBTL cycle functions as the organizing principle linking algorithmic design to biological realization.

The convergence of computational and experimental pipelines also facilitates reproducibility and transparency. Standardized data formats, metadata capture, and open benchmarking of generative models are enabling comparative evaluation across laboratories [55,119]. These practices are essential for establishing trust in AI-generated constructs and for supporting regulatory assessment of algorithmically designed therapeutics.

Overall, the integration of generative modeling, reinforcement optimization, and closed-loop DBTL validation defines a coherent framework for immune receptor and construct design. The transition from empirical mutagenesis to algorithmic generation compresses discovery timelines while expanding the accessible design space. As generative models increasingly integrate multimodal data linking molecular architecture to cellular outcomes, receptor design and phenotype programming are beginning to merge. This convergence marks the next stage of generative immunoengineering, where molecular construction and cellular behavior are optimized within a unified, adaptive learning system.

6. Programming Cell Phenotypes with Generative Models

The capacity to engineer receptors and signaling modules has redefined the molecular architecture of immune cells, but the next frontier of generative design extends beyond receptor composition to the regulation of cellular phenotype [58,120]. Immune function is not determined solely by receptor specificity but by the emergent states of activation, metabolism, and gene regulation that arise within complex intracellular networks [1]. Generative models are now being adapted to learn and manipulate these higher-order regulatory landscapes, enabling algorithmic control of differentiation, persistence, and functional polarization.

6.1. Modeling the Cellular State Space

Immune cells exist within a high-dimensional state space defined by transcriptional, epigenetic, and metabolic variables that evolve dynamically in response to environmental stimuli. Traditional analytical frameworks, such as clustering or trajectory inference, describe these states retrospectively but do not predict how they can be reprogrammed [121,122]. Generative models such as variational autoencoders (VAEs), diffusion probabilistic models, and generative adversarial networks (GANs) provide a fundamentally different capability: they learn the underlying probability distribution of cellular states and can interpolate or sample from this learned manifold to predict unseen or engineered phenotypes [123,124].

When trained on large-scale single-cell RNA sequencing (scRNA-seq) or ATAC-seq datasets, VAEs capture latent variables that correspond to biological processes such as activation, exhaustion, or memory differentiation [125,126,127]. These latent representations can be manipulated to simulate transcriptional reprogramming trajectories. For instance, altering specific latent dimensions can emulate transitions from naïve to effector or from effector to exhausted states, revealing regulatory dependencies that govern these transitions. The learned latent manifold effectively approximates the probabilistic topology of the immune cell-state landscape, providing a computational analog to Waddington’s epigenetic landscape but learned directly from data. Diffusion-based frameworks extend this capacity by modeling the stochastic evolution of gene-expression profiles, providing a generative account of cell-state dynamics over pseudo-temporal trajectories [128,129].

In parallel, multimodal models that integrate transcriptomic, proteomic, and metabolomic features are beginning to describe the coupled regulation of gene expression and metabolism in activated immune cells. Such models allow the generation of hypothetical phenotypes characterized by specific metabolic adaptations, cytokine secretion profiles, or migratory capacities [130,131]. By conditioning on environmental variables such as hypoxia, nutrient availability, or cytokine gradients, these frameworks can simulate how immune cells would adapt under diverse microenvironmental conditions. However, the accuracy of such simulations remains constrained by the completeness and batch-corrected quality of training data, emphasizing the ongoing need for harmonized multimodal datasets.

6.2. Generative Reprogramming and Perturbation Modeling

The translation of generative modeling from descriptive to prescriptive use involves connecting latent dimensions to actionable molecular interventions. Perturbation-based training strategies, such as those used in models like scGen and CPA (Compositional Perturbation Autoencoder), learn mappings between control and perturbed cellular states across thousands of experimental manipulations [132,133]. These models can then generate counterfactual predictions of how a given perturbation—such as gene knockout, cytokine exposure, or small-molecule treatment—would reprogram cellular transcriptional and proteomic profiles.
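The idea of a counterfactual prediction can be sketched with latent-space vector arithmetic, the approach popularized by scGen: estimate the shift between control and perturbed cells in latent space, then apply that shift to a new cell. The snippet below is a minimal illustration on hand-written two-dimensional latent vectors. It does not use the scGen or CPA packages, and the function names and numbers are hypothetical. A real workflow would obtain latent vectors from a trained encoder and map the result back to expression space with a decoder.

```python
from statistics import fmean
from typing import List

Vector = List[float]

def centroid(vectors: List[Vector]) -> Vector:
    """Mean latent vector of a group of cells."""
    return [fmean(column) for column in zip(*vectors)]

def perturbation_shift(z_control: List[Vector], z_perturbed: List[Vector]) -> Vector:
    """Latent direction separating perturbed cells from control cells."""
    c0, c1 = centroid(z_control), centroid(z_perturbed)
    return [b - a for a, b in zip(c0, c1)]

def predict_counterfactual(z_cell: Vector, delta: Vector, scale: float = 1.0) -> Vector:
    """Shift an unperturbed cell along the learned perturbation direction."""
    return [z + scale * d for z, d in zip(z_cell, delta)]

# Toy latent coordinates (assumed to come from an encoder not shown here).
z_control = [[0.0, 1.0], [2.0, 3.0]]
z_perturbed = [[2.0, 1.0], [4.0, 2.0]]

delta = perturbation_shift(z_control, z_perturbed)
print(predict_counterfactual([0.5, 0.5], delta))
```

A single global shift assumes the perturbation acts similarly across cell types, which is one reason such predictions are most trustworthy close to the training data.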

When coupled with CRISPR-based Perturb-seq data, generative perturbation models identify minimal sets of transcriptional regulators whose modulation is predicted to produce desired phenotypic outcomes. This framework has been applied to predict reprogramming strategies that induce T-cell memory phenotypes or reverse exhaustion-associated transcriptional signatures [134,135,136]. In macrophages and dendritic cells, similar models have been used to explore how combinations of signaling inputs reshape inflammatory versus tolerogenic polarization. These predictions can then guide targeted interventions using synthetic circuits, small molecules, or genome editing [137]. Predictive fidelity, however, depends on the coverage of the training manifold; models extrapolate reliably only within data-supported regions of perturbational space.

Integrating these models with reinforcement learning further enables iterative optimization of intervention strategies. The algorithm explores a combinatorial action space of perturbations and uses feedback from simulated outcomes to propose the most effective intervention sequences. This approach transforms cellular reprogramming into an optimization problem solvable by artificial intelligence, allowing dynamic control of gene-regulatory networks rather than static modification of single targets.
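A minimal sketch of this kind of search, using the simplest reinforcement-learning setting (an epsilon-greedy multi-armed bandit), is shown below. Each action is a pair of perturbations and the reward comes from a toy simulator. The perturbation names, effect sizes, synergy term, and hyperparameters are invented placeholders. A realistic system would replace `simulate` with a generative perturbation model and would typically handle sequential decisions rather than one-shot choices.

```python
import random
from itertools import combinations

# Placeholder perturbation names; effect sizes are invented for illustration.
SINGLE_EFFECT = {"TF_A": 0.2, "TF_B": 0.5, "TF_C": 0.1, "cytokine_X": 0.3}
SYNERGY = {("TF_A", "cytokine_X"): 0.4}

def simulate(action):
    """Toy stand-in for a simulated phenotype score (higher is better)."""
    return sum(SINGLE_EFFECT[p] for p in action) + SYNERGY.get(action, 0.0)

def epsilon_greedy(actions, n_rounds=50, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    counts = {a: 0 for a in actions}
    values = {a: 0.0 for a in actions}
    # Try every action once, then mix exploration and exploitation.
    schedule = list(actions) + [None] * n_rounds
    for planned in schedule:
        if planned is not None:
            action = planned
        elif rng.random() < epsilon:
            action = rng.choice(actions)            # explore
        else:
            action = max(actions, key=values.get)   # exploit current estimate
        reward = simulate(action)
        counts[action] += 1
        # Incremental running mean of observed rewards.
        values[action] += (reward - values[action]) / counts[action]
    return max(actions, key=values.get), values

actions = list(combinations(SINGLE_EFFECT, 2))
best, values = epsilon_greedy(actions)
print(best, round(values[best], 2))
```

Because of the synergy term, the best pair here is not the pair with the two largest individual effects, which is the kind of combinatorial interaction that motivates searching over combinations rather than ranking single targets.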

6.3. Linking Generative Models to Synthetic Circuits

Generative modeling also provides a computational substrate for the design of synthetic gene circuits that enact desired cell-state transitions [138,139]. Once a target phenotype is defined in latent space, such as resistance to exhaustion, enhanced persistence, or altered cytokine balance, AI models can identify candidate regulatory motifs or signaling pathways that need to be modulated to achieve that state. In this context, regulatory motifs may refer to either cis-regulatory DNA elements controlling transcriptional logic or dynamic feedback structures within signaling networks, depending on the level of abstraction. Synthetic biologists can then construct corresponding genetic circuits to implement these predicted control strategies.

For example, reinforcement learning coupled with gene-network simulations has been used to design circuit architectures that stabilize T-cell metabolic fitness by dynamically regulating glycolytic and oxidative pathways [140,141]. Diffusion-based generators trained on transcriptional responses to immune checkpoints have proposed feedback modules that mitigate activation-induced exhaustion [30,142]. Such designs translate the statistical regularities learned by generative models into actionable biological logic, closing the gap between abstract representation and physical implementation.

The integration of generative models with experimental libraries of promoters and enhancer elements further enables data-driven optimization of regulatory sequences. By learning mappings between sequence composition and expression amplitude or inducibility, generative models can propose synthetic regulatory elements that achieve precise transcriptional tuning within engineered immune cells [143]. This capability is especially valuable for balancing effector potency and safety in next-generation CAR-T or TCR-engineered therapies, where overactivation or premature exhaustion can compromise efficacy.

6.4. Toward Closed-Loop Phenotype Design

A defining feature of generative phenotype modeling is the potential for closed-loop optimization in which computational predictions are continuously refined through empirical feedback [144]. Integration with high-throughput perturbation platforms, time-lapse imaging, and multi-omic profiling allows real-time assessment of how engineered interventions reshape cellular states [145]. Data from each iteration is used to update generative priors, improving accuracy and adaptability.

This closed-loop paradigm parallels the design–build–test–learn cycles established for molecular engineering but operates at the systems level of cellular behavior [146,147]. Extending the molecular DBTL framework to the cellular systems level enables iterative refinement of both molecular components and emergent phenotypes within a unified feedback architecture. In such frameworks, the model functions as a control algorithm that continuously adjusts interventions to maintain desired phenotypic states. Implementing these adaptive DBTL frameworks requires standardized data formats and interoperable experimental protocols to ensure reproducibility and safe automation. These systems could eventually form the basis of autonomous adaptive immunoengineering platforms in which AI proposes genetic or pharmacological modifications, the laboratory executes them through automated microfluidic experimentation, and the resulting data retrain the model in real time.

Implementing such a feedback architecture requires not only computational sophistication but also standardized experimental protocols and interoperable data formats. Advances in laboratory automation, robotic culture systems, and real-time single-cell monitoring are making these integrations feasible. As models become capable of predicting the dynamic responses of engineered immune cells, phenotype programming may evolve from a trial-and-error discipline into a continuous adaptive optimization process [148].

The use of generative models to program immune cell phenotypes represents a conceptual expansion of synthetic immunology. It extends the logic of receptor and construct design into the realm of dynamic cellular behavior. By learning from the multidimensional data that describe activation, differentiation, and adaptation, AI systems can propose intervention strategies that achieve desired phenotypic equilibria with minimal experimental iteration [149]. The resulting convergence of generative modeling, perturbation analysis, and synthetic circuit design transforms cellular reprogramming from a descriptive science into a predictive and creative enterprise.

In this emerging framework, immune cells are no longer passive recipients of engineered receptors but programmable entities whose behavior can be sculpted algorithmically. This capacity to generate and stabilize beneficial phenotypes, whether through transcriptional modulation, metabolic rewiring, or synthetic gene networks, constitutes a central milestone on the path toward programmable immunity (Figure 2).

7. The Design–Build–Test–Learn Loop at Scale

The maturation of generative immunoengineering depends not only on algorithmic sophistication but on the integration of computation, automation, and experimentation within a unified feedback architecture. The DBTL loop formalizes this integration as a recursive cycle in which hypotheses generated by artificial intelligence are iteratively realized, evaluated, and used to retrain the model [147,150]. At scale, DBTL transforms immunoengineering from a sequential workflow into a continuously adaptive system that fuses discovery, validation, and optimization into a single process.

7.1. The Design Phase: Generative Hypothesis Formation

In the generative paradigm, design constitutes a computational experiment in which the model explores the probability distribution of biological functions. Large foundation models trained on multi-omic and structural datasets generate receptor sequences, circuit architectures, or cell-state perturbations that satisfy defined objective functions such as affinity, stability, specificity, metabolic resilience, and manufacturability [151,152].

Multi-objective reinforcement learning (MORL) and Pareto-front optimization are increasingly used to balance these criteria, ensuring that improvements in one property do not compromise another. For example, reinforcement agents can adjust generative sampling to favor constructs that maintain predicted folding stability while maximizing target binding and minimizing immunogenic epitopes [94,153,154]. Bayesian optimization frameworks quantify uncertainty across latent dimensions, guiding exploration toward regions of design space where the model’s confidence is low but potential reward is high.
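To make the Pareto-front idea concrete, the sketch below filters a handful of candidate constructs down to the non-dominated set. Each candidate is scored on three objectives expressed so that higher is better (predicted affinity, predicted stability, and negated predicted immunogenicity). The candidate names and scores are made up for illustration.

```python
from typing import Dict, Tuple

Scores = Tuple[float, ...]

def dominates(p: Scores, q: Scores) -> bool:
    """p dominates q if it is at least as good everywhere and better somewhere."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

def pareto_front(candidates: Dict[str, Scores]) -> Dict[str, Scores]:
    """Keep candidates that no other candidate dominates."""
    return {
        name: s
        for name, s in candidates.items()
        if not any(dominates(other, s)
                   for other_name, other in candidates.items()
                   if other_name != name)
    }

# (affinity, stability, -immunogenicity); all values are hypothetical.
candidates = {
    "c1": (0.9, 0.4, -0.2),
    "c2": (0.7, 0.8, -0.1),
    "c3": (0.6, 0.7, -0.3),   # worse than c2 on every objective
    "c4": (0.9, 0.3, -0.5),   # no better than c1 on any objective
}

front = pareto_front(candidates)
print(sorted(front))
```

The surviving candidates represent different trade-offs (c1 favors affinity, c2 favors stability), so the choice between them is left to a downstream preference or to experimental follow-up rather than collapsed into a single weighted score.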

At the cellular level, generative models design intervention strategies that reprogram gene-regulatory networks or metabolic fluxes to achieve stable phenotypes. These designs may take the form of predicted transcription-factor combinations, circuit topologies, or epigenetic modifications [155,156]. In silico simulations using agent-based or ODE-based digital twins of immune cells allow evaluation of predicted designs before synthesis, effectively providing pre-experimental validation within the design phase itself [157]. This digital pre-screening step serves as an internal “virtual test phase,” reducing wet-lab load while enhancing safety and design traceability within the DBTL framework.

7.2. The Build Phase: Automated Synthesis and Cellular Integration

The build phase converts digital designs into tangible biological constructs. Modern biofoundries employ modular, high-throughput synthesis pipelines that integrate robotic liquid handling, automated cloning, and barcoded sample tracking. DNA assembly methods such as Golden Gate, Gibson, and enzymatic ligation-independent cloning enable parallel production of thousands of constructs in standardized vectors [158,159].

In immunoengineering, the build step involves integrating these synthetic constructs into cellular systems. CRISPR/Cas, base-editing, and transposon-mediated delivery methods allow targeted insertion of designed sequences into immune-cell genomes, often at safe-harbor loci that permit consistent expression [160]. Microfluidic electroporation and viral-vector platforms have been optimized for parallel processing of primary T or NK cells, enabling libraries of engineered variants to be generated under controlled conditions [161]. Each design instance is annotated with its origin, parameters, and vector architecture, allowing traceable linkage between computational proposal and biological realization. This traceability is critical for both reproducibility and regulatory compliance, ensuring transparent lineage from digital design to physical construct.
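One way to picture this traceable linkage is a provenance record whose identifier is derived from its own contents, so that any change to the sequence, model version, or vector yields a new identifier. The following sketch is only an assumption about how such a record might look. The field names, example values, and the choice of a truncated SHA-256 digest are hypothetical and do not reflect any specific LIMS or biofoundry schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class DesignRecord:
    """Hypothetical provenance record for one generated construct."""
    sequence: str            # designed amino-acid or DNA sequence
    model_version: str       # version tag of the generative model
    sampling_params: dict    # e.g. temperature, conditioning labels
    vector_backbone: str     # delivery vector used in the build step
    integration_locus: str   # intended genomic integration site

    def design_id(self) -> str:
        # Content-derived identifier: identical records give identical IDs.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:16]

record = DesignRecord(
    sequence="MKTAYIAKQR",  # placeholder sequence, not a real receptor
    model_version="generator-v0.1",
    sampling_params={"temperature": 0.8, "condition": "low_tonic_signaling"},
    vector_backbone="example_lentiviral_backbone",
    integration_locus="example_safe_harbor",
)
print(record.design_id())
```

Storing the identifier alongside every downstream measurement lets assay results be joined back to the exact computational proposal that produced them.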

Emerging cell-free systems provide an intermediate validation layer. DNA templates or mRNA constructs can be expressed in vitro to assay folding, binding, or signaling activity before introduction into living cells [162,163]. These rapid screening layers reduce the cost and biosafety burden of testing AI-generated sequences. Integration with laboratory-information management systems (LIMS) ensures that metadata, sequence provenance, and performance metrics flow seamlessly back into the digital design environment [164].

7.3. The Test Phase: High-Dimensional and Multiscale Evaluation

Testing represents the sensory layer of the DBTL system, translating experimental outcomes into quantitative metrics for model refinement. High-throughput display systems—yeast, phage, or mammalian—provide initial binding and expression readouts for receptor libraries [77]. Flow cytometry and surface plasmon resonance quantify affinity and kinetic constants, while single-cell assays capture functional endpoints such as cytokine release, proliferation, or exhaustion markers [165].

Recent developments in multi-omic screening have expanded the granularity of testing. Single-cell RNA sequencing, proteomic barcoding, and metabolomic profiling characterize thousands of engineered cells simultaneously, revealing how synthetic constructs reshape global cellular states [166]. Spatial transcriptomics and live-cell imaging provide contextual information about cell–cell interactions, trafficking, and synapse formation.

These high-dimensional data streams are analyzed through unsupervised embedding and graph-based clustering to extract latent features representing functional archetypes. Statistical coupling between design parameters and phenotypic readouts allows causal inference about which molecular features drive performance. Importantly, uncertainty quantification metrics guide the selection of candidates for deeper mechanistic analysis, ensuring that the test phase not only validates designs but also enriches the informational value of subsequent learning cycles [167]. Incorporating Bayesian calibration and explainable modeling frameworks can further ensure that performance gains are mechanistically interpretable rather than purely correlational.

7.4. The Learn Phase: Model Updating and Active Reinforcement

The learn phase closes the feedback loop by converting empirical results into updated model parameters. Instead of static retraining, modern systems employ online learning architectures in which models ingest experimental data in near real time [102,168]. Each new data batch adjusts the model’s latent embeddings, probability weights, and uncertainty estimates, progressively aligning computational predictions with biological reality.
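The simplest version of such an incremental update is a conjugate Gaussian belief about one construct's activity that is revised after every measurement, without retraining from scratch. The sketch below assumes a known measurement-noise variance and uses made-up numbers. Production systems would update far richer models, but the pattern of shifting the estimate and shrinking its uncertainty with each observation is the same.

```python
from dataclasses import dataclass

@dataclass
class Belief:
    mean: float  # current estimate of construct activity
    var: float   # current uncertainty (variance) of that estimate

def update(belief: Belief, observation: float, noise_var: float) -> Belief:
    """Normal-normal conjugate update for a single new measurement."""
    prior_precision = 1.0 / belief.var
    obs_precision = 1.0 / noise_var
    post_var = 1.0 / (prior_precision + obs_precision)
    post_mean = post_var * (prior_precision * belief.mean
                            + obs_precision * observation)
    return Belief(post_mean, post_var)

# Start from a vague prior and stream in hypothetical assay readouts.
belief = Belief(mean=0.0, var=1.0)
for readout in [2.0, 2.0, 1.5]:
    belief = update(belief, readout, noise_var=1.0)
    print(belief)
```

Each measurement pulls the mean toward the data and reduces the variance, which is the quantity that active-learning strategies then use to decide what to test next.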

Active learning strategies determine which experiments would most efficiently reduce model uncertainty. The algorithm selects a subset of candidates predicted to yield the highest information gain, focusing experimental resources on the most informative regions of design space [6]. Reinforcement learning further couples the model to the physical system: successful experimental outcomes increase reward signals that bias subsequent generative sampling toward productive directions.
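As a rough illustration of candidate selection, the sketch below scores each candidate from the predictions of a small model ensemble, using ensemble disagreement as a stand-in for uncertainty. Setting the exploitation weight to zero gives pure uncertainty sampling, while a nonzero weight gives an upper-confidence-bound style rule that also rewards high predicted performance. The ensemble predictions, variant names, and weights are hypothetical, and ensemble variance is only a proxy for expected information gain.

```python
from statistics import fmean, pstdev
from typing import Dict, List

def select_batch(predictions: Dict[str, List[float]], k: int,
                 beta: float = 1.0, exploit_weight: float = 1.0) -> List[str]:
    """Rank candidates by exploit_weight * mean + beta * std and return the top k."""
    def score(name: str) -> float:
        preds = predictions[name]
        return exploit_weight * fmean(preds) + beta * pstdev(preds)
    return sorted(predictions, key=score, reverse=True)[:k]

# Predicted activity of four variants from a three-member ensemble (made up).
predictions = {
    "v1": [0.5, 0.5, 0.5],   # confident, mediocre
    "v2": [0.2, 0.8, 0.5],   # uncertain
    "v3": [0.9, 0.9, 0.8],   # confident, strong
    "v4": [0.1, 0.9, 0.2],   # most uncertain
}

print("uncertainty sampling:", select_batch(predictions, k=2, exploit_weight=0.0))
print("UCB-style selection: ", select_batch(predictions, k=2))
```

The two rules choose different batches from the same predictions, which shows how the acquisition criterion, and not only the model, determines where experimental effort is spent.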

In practice, these adaptive feedback loops create a cyber-physical learning organism: an integrated system in which computational and biological components co-evolve. Each iteration not only refines the model’s internal representation but also generates new empirical priors that expand its capacity to generalize [169]. The intended result is an accelerating gain in discovery efficiency, with each loop yielding designs of higher predicted performance and lower variance between simulation and experiment. This iterative refinement embodies a data-driven analogue of biological evolution (variation, selection, and retention) executed within an AI-governed experimental ecosystem.

7.5. Automation, Data Infrastructure, and Self-Optimizing Biofoundries

At scale, DBTL becomes inseparable from automation. Modern biofoundries combine robotics, microfluidics, and advanced data orchestration to enable continuous closed-loop experimentation. AI design servers communicate directly with robotic assembly lines through standardized APIs, initiating synthesis and testing sequences without manual intervention [170,171]. Real-time sensor data—temperature, reagent usage, cell viability, expression levels—are streamed to cloud infrastructures that synchronize model updates.

Digital twins of the laboratory simulate physical processes in silico, allowing predictive scheduling, error correction, and adaptive re-prioritization of experimental tasks. These twins maintain a live correspondence between virtual and real experiments, permitting instantaneous recalibration when deviations occur [172]. Integration with edge-computing modules enables local decision-making, reducing latency between data acquisition and design refinement.

Data interoperability is central to scaling. Community data standards and reporting guidelines (e.g., SBOL, AnIML, MIFlowCyt), together with shared ontologies and metadata schemas, ensure that information from diverse instruments and facilities can be aggregated for cross-institutional learning. Cloud-native data lakes equipped with version control and provenance tracking store raw and processed datasets, supporting reproducibility and regulatory auditing [173,174]. Standardization of both data semantics and experiment-level metadata remains a bottleneck, underscoring the importance of community-driven interoperability initiatives.

As these infrastructures mature, self-optimizing biofoundries are emerging: facilities in which generative AI orchestrates the entire pipeline from molecular design to functional evaluation. Such systems can autonomously evolve improved receptor variants, optimize circuit architectures, and fine-tune culture parameters to enhance yield or stability [175]. Over time, the accumulated data acts as an institutional memory, enabling transfer learning across projects and the continuous improvement of both algorithms and experimental protocols.

7.6. Integrative and Translational Implications

The large-scale implementation of DBTL frameworks in immunoengineering signifies a structural reorganization of how biological knowledge is produced. Instead of discrete projects defined by static hypotheses, research becomes a dynamic optimization process governed by real-time feedback [176,177]. Generative models no longer operate as isolated analytical tools but as components of an evolving experimental ecosystem.

At the translational level, scalable DBTL systems accelerate the path from computational concept to clinical candidate. By systematically linking receptor sequence, cell phenotype, and manufacturing parameters, these platforms can identify predictive markers of efficacy and safety early in development [178,179]. Closed-loop optimization also supports adaptive manufacturing, where process parameters are adjusted algorithmically to maintain product quality in response to real-time analytics.

Ultimately, the convergence of generative AI, automation, and scalable experimentation transforms immunoengineering into a continuously learning infrastructure. Each iteration expands the collective intelligence encoded in both digital models and biological systems, gradually approaching a regime where the boundaries between designing immunity, testing immunity, and learning immunity become indistinguishable. This architecture represents the operational foundation of programmable immunity, translating theoretical possibility into a self-refining experimental reality. In this architecture, the DBTL framework functions not merely as an engineering workflow but as a new epistemology for biological design, one where knowledge generation, model evolution, and therapeutic innovation proceed as an inseparable continuum.

8. Translational Opportunities and Clinical Outlook

Generative immunoengineering represents a fundamental reorganization of how cell-based therapies are conceived, tested, and manufactured. What was once a linear sequence spanning discovery, optimization, and production is becoming an adaptive continuum in which computation, experimentation, and clinical translation are tightly coupled through iterative feedback [59]. Within this architecture, the immune system is no longer treated solely as a biological entity to be modulated but as a programmable substrate whose molecular and cellular functions can be designed, validated, and continuously refined through artificial intelligence. This conceptual shift recasts translational medicine as a dynamic learning process, one in which biology and computation evolve in synchrony. This paradigm reframes therapeutic development as a bidirectional learning system in which clinical data, molecular modeling, and experimental outcomes continuously inform one another, effectively creating a feedback-coupled translational pipeline.

At the foundation of this transformation is the capacity of generative models to accelerate discovery and optimization with unprecedented efficiency. Large-scale protein and cellular language models trained on structural, sequence, and binding-affinity data can propose millions of receptor or circuit variants that satisfy pre-defined biophysical and functional constraints. Multi-objective optimization, combining Bayesian inference, reinforcement learning, and evolutionary search, allows competing design criteria such as affinity, folding stability, and manufacturability to be reconciled within a single probabilistic framework [94,152]. When coupled to high-throughput synthesis and functional screening, these algorithms transform receptor design from an empirical search into a statistically guided exploration of sequence space. Early analyses suggest that such workflows can reduce the experimental burden by an order of magnitude, compressing timelines that once spanned years into a matter of weeks while preserving, and in some cases improving, success rates in identifying viable therapeutic constructs. Such acceleration remains contingent on access to high-quality multimodal training data and harmonized experimental standards.

Beyond speed, the generative paradigm introduces the possibility of genuine personalization. Patient-derived molecular data, including tumor transcriptomes, HLA genotypes, immune-repertoire sequencing, and single-cell multi-omics, can serve as conditioning variables for model inference [149]. This enables the generation of individualized T-cell receptors or chimeric antigen receptors that are predicted to engage patient-specific neoantigens while minimizing self-reactivity. In principle, these digital blueprints can be synthesized and introduced directly into autologous lymphocytes or natural-killer cells within closed, automated manufacturing systems [180]. Crucially, the same framework permits adaptive therapy: continuous molecular monitoring through circulating tumor DNA, antigen-escape profiling, or cytokine dynamics can be fed back to retrain models and update constructs in response to disease evolution. This continuous feedback loop effectively transforms treatment into a dynamic control process, aligning therapeutic pressure with tumor or immune-escape kinetics in near real time. Therapy thus becomes a moving equilibrium, a co-adaptive process in which the treatment learns from the patient as much as the patient responds to the treatment [181].

Realizing this adaptive vision requires new regulatory and clinical-trial architectures. Current frameworks presuppose fixed molecular entities, yet generative therapeutics are inherently dynamic. Regulatory bodies have begun developing guidance under the principles of Good Machine Learning Practice, emphasizing transparency, explainability, and post-deployment surveillance. Adaptive or platform trials may replace static designs, allowing algorithmically derived construct revisions within predefined boundaries. Version-controlled documentation of model parameters, training data, and validation outcomes will constitute an “algorithmic dossier,” analogous to the chemistry, manufacturing, and controls (CMC) documentation required for biologics [182,183]. Post-market oversight is expected to include continuous monitoring for model drift and periodic re-certification of retrained algorithms. Together, these mechanisms will form the basis of a new discipline, regulatory bioinformatics, dedicated to the governance of learning systems in medicine. Such governance will likely integrate algorithmic explainability metrics, model-card disclosures, and standardized digital audit trails to ensure accountability throughout the therapeutic life cycle.

Translation from digital design to clinical-grade production further depends on automation and digital infrastructure. AI-integrated biofoundries now link computational design servers directly to robotic assembly, viral-vector packaging, and closed-system cell expansion, creating an unbroken digital thread from algorithmic proposal to physical manufacture. Each construct carries a persistent identifier linking its computational origin to its production batch, ensuring traceability across the product life cycle [184]. Digital-twin bioreactors simulate nutrient gradients, cytokine signaling, and metabolic flux in real time, adjusting culture conditions through reinforcement-learning controllers to preserve cell viability and phenotypic stability. Multi-omic sensors monitoring transcriptomic, impedance, and metabolic signatures feed continuous data into these control layers, allowing predictive correction of deviations before product quality is compromised. Such cyber-physical feedback transforms Good Manufacturing Practice (GMP) environments from static production lines into adaptive learning systems that improve with every run [185,186]. Collectively, these infrastructures redefine GMP as a dynamic rather than static framework, one in which quality is maintained through continuous sensing, prediction, and correction rather than retrospective testing.
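
The persistent identifier described above can be sketched as a content-derived hash of the design record. This is a minimal illustration, assuming the identifier is a SHA-256 digest of a canonical JSON serialization. The field names, the `BatchRecord` type, and the helper functions are hypothetical and do not describe how any cited biofoundry implements traceability.

```python
import hashlib
import json
from dataclasses import dataclass


def construct_id(sequence: str, model_version: str, design_params: dict) -> str:
    """Derive a stable identifier from a construct's computational origin."""
    record = {
        "sequence": sequence,
        "model_version": model_version,
        "design_params": design_params,
    }
    # Sorted keys and fixed separators make the serialization canonical,
    # so the same design always maps to the same identifier.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class BatchRecord:
    """Hypothetical manufacturing record that carries the design identifier."""
    batch_id: str
    construct_id: str


def verify_link(
    batch: BatchRecord, sequence: str, model_version: str, design_params: dict
) -> bool:
    """Check that a batch still points at the design it claims to originate from."""
    return batch.construct_id == construct_id(sequence, model_version, design_params)
```

Because the identifier is recomputed from the design itself, any change to the sequence, model version, or parameters breaks the link and becomes visible in an audit.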

Although oncology remains the initial testing ground, the principles of generative immunoengineering are applicable across diverse therapeutic landscapes. In solid tumors, generative models are being used to design multispecific and logic-gated CAR architectures that improve selectivity and persistence while mitigating off-tumor toxicity. In autoimmunity, the same computational logic can be applied to the design of regulatory-T-cell or dendritic-cell circuits that restore immune tolerance without systemic suppression. Regenerative medicine may benefit from macrophage or tolerogenic antigen-presenting-cell designs optimized for cytokine balance and metabolic resilience, promoting tissue repair and graft acceptance (Table 2) [50,96]. Similar approaches extend to infectious-disease preparedness, where rapid, model-driven updates to antigen or receptor design could allow immune interventions to evolve as quickly as the pathogens they target.

Ensuring the safety and interpretability of algorithmically derived constructs remains paramount. Multi-layered validation pipelines combine in silico prediction, explainable-AI analysis, and empirical verification. Generative outputs are screened for immunogenic motifs, structural instability, and potential off-target binding before synthesis. These in silico safeguards are complemented by human-in-the-loop review protocols to prevent over-reliance on automated predictions in high-risk therapeutic contexts. Attention-based visualization and feature-attribution mapping identify the sequence regions most influential in the model’s predictions, while uncertainty quantification provides calibrated confidence estimates. Empirical assays, ranging from multiplex peptide arrays to single-cell cytotoxicity screens, serve as orthogonal tests of computational accuracy [196,197]. To formalize accountability, proposed standards for Algorithmic Documentation Files will record the architecture, data provenance, and performance metrics of each deployed model, enabling reproducibility and auditability across regulatory jurisdictions [198].
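
The layered screening logic above can be illustrated with a toy triage function. This is a sketch under stated assumptions: the motif blocklist, the ensemble of predictor callables, and both thresholds are hypothetical, and the spread of ensemble scores is only a crude stand-in for calibrated uncertainty.

```python
from statistics import mean, stdev
from typing import Callable, Dict, List, Sequence


def triage(
    candidates: Sequence[str],
    motif_blocklist: Sequence[str],
    predictors: Sequence[Callable[[str], float]],
    min_score: float = 0.7,  # assumed acceptance threshold
    max_std: float = 0.03,   # assumed limit on ensemble disagreement
) -> Dict[str, List[str]]:
    """Sort candidates into accepted, human-review, and rejected bins."""
    if len(predictors) < 2:
        raise ValueError("at least two predictors are needed to estimate spread")
    bins: Dict[str, List[str]] = {"accepted": [], "review": [], "rejected": []}
    for seq in candidates:
        # Layer 1: reject anything containing a flagged motif.
        if any(motif in seq for motif in motif_blocklist):
            bins["rejected"].append(seq)
            continue
        # Layer 2: score with every predictor and summarize the ensemble.
        scores = [predict(seq) for predict in predictors]
        if mean(scores) < min_score:
            bins["rejected"].append(seq)
        elif stdev(scores) > max_std:
            # Layer 3: high disagreement goes to a human reviewer, not to synthesis.
            bins["review"].append(seq)
        else:
            bins["accepted"].append(seq)
    return bins
```

Routing high-disagreement candidates to review rather than rejecting them keeps a human in the loop exactly where the automated prediction is least trustworthy.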

Economic and infrastructural considerations are equally transformative. The high up-front computational investment in data curation and model training is counterbalanced by substantial downstream savings from reduced screening, accelerated iteration, and automated manufacturing. Distributed biofoundries connected through secure cloud infrastructures could enable regional or hospital-based production of autologous therapies, reducing logistic complexity and cold-chain dependency. Achieving this vision will require harmonized digital quality-management systems and interoperable GMP documentation across sites. Ensuring equitable access to these infrastructures, particularly in low- and middle-income regions, will be critical to prevent a widening translational divide in personalized immunotherapy. Health-economic evaluation frameworks must evolve to recognize the amortized value of continuously improving algorithms and the outcomes they enable, transitioning from static cost-effectiveness assessments toward performance-linked reimbursement models [199].

The ethical, legal, and social implications of generative immunoengineering must evolve alongside its technical capabilities. The use of clinical and genomic data for model training necessitates explicit consent frameworks specifying secondary and longitudinal data use. Questions of intellectual property, such as whether ownership resides in datasets, model architectures, or generated sequences, require global policy coordination. Dual-use risks are real: systems capable of designing potent therapeutic receptors could, in principle, be misused to generate immune-evasive or pathogenic molecules. Safeguarding measures analogous to existing biosecurity treaties will be essential, as will equitable access to computational infrastructure to prevent concentration of capability within a few technologically privileged centers. Open repositories, distributed compute alliances, and internationally governed consortia can help ensure that the benefits of programmable immunity are shared globally rather than confined to specific regions or sectors [200,201].

Taken together, these translational trajectories delineate the emergence of a new therapeutic paradigm. In the near term, AI-optimized CAR and TCR constructs with integrated safety and quality-monitoring frameworks are poised to enter early-phase trials. Over the next decade, standardized digital biofoundries and adaptive regulatory pipelines are likely to support the extension of this technology to autoimmune, transplant, and regenerative contexts [96]. Beyond that horizon lies the prospect of a continuously learning therapeutic ecosystem in which each patient outcome refines the generative models that guide subsequent design. In this future, therapeutic innovation and biological understanding become inseparable, and the immune system itself is reimagined as a programmable interface between computation and life, the defining hallmark of AI-enabled generative design of immune cells and receptors for programmable immunity [202]. This synthesis of adaptive intelligence and living matter marks not only a technological milestone but a conceptual redefinition of medicine itself, one in which therapy, learning, and evolution converge within a single generative continuum.

9. Ethical, Regulatory, and Societal Implications of Generative Immunoengineering

The capacity to design immune cells and receptors through generative artificial intelligence represents both a profound scientific advance and a consequential ethical turning point. The very features that make this technology transformative, namely its speed, adaptability, and autonomy, also challenge the traditional mechanisms through which biomedical innovation has been governed. As generative models begin to influence the structure of experimental inquiry, the criteria of clinical validation, and the distribution of therapeutic access, they introduce a new layer of moral and regulatory responsibility: the obligation to design not only biological systems but also the systems of oversight that will ensure their safe and equitable use [203,204]. This dual responsibility, to engineer both biology and the ethics that govern it, defines an emerging field of generative bioethics evolving in parallel with generative biology itself.

At the ethical level, the most immediate concern involves the use of patient-derived data for model training and algorithmic conditioning. The effectiveness of generative immunoengineering depends on access to large, high-quality datasets encompassing genomic, proteomic, and clinical information. Yet the aggregation of such data, often derived from identifiable biological samples, raises questions about consent, ownership, and longitudinal use. Current consent models, designed for discrete studies, are poorly suited to the continuous learning paradigm of AI-driven research. Dynamic or “evergreen” consent frameworks, allowing participants to renew, modify, or revoke data permissions as models evolve, may become essential to align data use with individual autonomy [205,206,207]. In clinical settings, such adaptive consent must be coupled with ongoing feedback between patients and therapeutic models, ensuring that participants maintain agency as their biological and algorithmic profiles co-evolve. Likewise, new institutional mechanisms will be needed to recognize participants not merely as data donors but as contributors to the generative process, with potential claims to benefit-sharing or acknowledgment. Including under-represented populations in training datasets will also be critical to mitigate demographic bias and ensure global generalizability of AI-driven immune design.

Transparency and explainability constitute a second ethical axis. The interpretability of generative outputs is crucial for both scientific trust and clinical safety. While recent advances in attention mapping, saliency analysis, and uncertainty quantification have improved insight into how models generate biological designs, the epistemic gap between statistical pattern recognition and causal biological reasoning persists. Regulators, clinicians, and researchers must therefore treat algorithmic explainability not as an optional feature but as a moral imperative. Documentation of model lineage, training data provenance, and version history should be considered integral to ethical disclosure, comparable in importance to reporting methods in experimental science [208,209]. Bridging this interpretive gap will require hybrid frameworks that combine mechanistic modeling with data-driven generation so that algorithmic creativity remains biologically intelligible. Without this transparency, the reproducibility crisis that has affected other fields could extend into synthetic immunology, undermining confidence in AI-mediated design.

Regulatory institutions now face the challenge of overseeing entities that are not static products but evolving systems. Conventional approval pathways, designed for fixed molecular entities, cannot easily accommodate therapeutic platforms that learn from new data and autonomously propose novel constructs [210]. Emerging frameworks under the rubric of Good Machine Learning Practice (GMLP) attempt to address this tension by introducing algorithmic change-control mechanisms, documentation standards, and real-time performance monitoring (Figure 3). In the context of generative immunoengineering, such regulation will likely require integration of algorithmic dossiers into the chemistry-manufacturing-controls infrastructure of GMP [211]. Each generative model may be regarded as a “living protocol” subject to continuous validation and regulatory auditing. This convergence of computational oversight and biomanufacturing control represents a defining shift in the governance of biomedical AI and will require cross-trained professionals fluent in both algorithmic governance and GMP compliance.

A further dimension of ethical responsibility arises from dual-use potential and biosecurity. The same generative architectures that optimize immune recognition could, in principle, be repurposed to design immune-evasive pathogens, synthetic toxins, or receptor-binding antagonists. As algorithmic tools diffuse through open scientific ecosystems, the line between beneficial and hazardous application becomes porous. International biosecurity regimes, historically focused on material agents and physical laboratories, must therefore expand to encompass informational biosecurity: the governance of digital models, codebases, and data pipelines capable of generating biological functions. Building multi-layered safeguards that combine technical containment, federated architectures, and ethical licensing will be central to future risk management. Developing secure access frameworks such as controlled model release, tiered licensing, and federated training will be essential to balance open scientific exchange with risk mitigation [204,212]. Ethical oversight committees within research institutions should incorporate expertise in AI safety and cybersecurity alongside traditional biosafety representation.

The societal implications of generative immunoengineering extend beyond bioethics into the political economy of biomedical innovation. Because generative design relies on large computational infrastructure and proprietary datasets, there is a risk that technological capacity, and therefore therapeutic opportunity, will become concentrated within a small number of institutions and nations. Without deliberate intervention, the “AI divide” in healthcare could mirror and amplify existing inequities in access to genomic medicine. Counteracting this trend will require international coordination of data-sharing standards, open-access repositories for pre-trained biological models, and collaborative licensing arrangements that enable low-resource regions to participate in the development and deployment of AI-driven therapeutics [213,214]. Equitable access is not merely a matter of distributive justice but of scientific robustness: diversity in training data enhances generalizability and reduces model bias, thereby improving global therapeutic safety. Moreover, asymmetry in data ownership and computational access risks consolidating economic power among a few institutions, creating new forms of biomedical dependency that demand policy-level correction.

The epistemological implications are equally significant. Generative immunoengineering reconfigures the relationship between hypothesis and experiment, transforming discovery into an iterative dialogue between algorithmic inference and empirical validation [215,216]. This shift blurs the historical boundary between knowledge generation and technological fabrication, forcing a reconsideration of what counts as “understanding” in biology. If a model can design a receptor that functions optimally without the designer fully comprehending the underlying causal grammar, the locus of scientific agency moves from the human investigator to the hybrid system of human and machine. In this configuration, agency becomes distributed, a co-production of human intention and algorithmic inference, raising new questions about authorship, accountability, and epistemic responsibility. Such transformations invite reflection on the nature of explanation, accountability, and authorship in the age of algorithmic biology. The challenge for future scientific culture will be to ensure that interpretability and conceptual insight evolve alongside performance and automation [215].

Finally, the integration of generative immunoengineering into clinical and societal systems will demand new forms of governance that are anticipatory rather than reactive. Ethical frameworks should be embedded from the outset of model development, not retrofitted in response to controversy. Cross-disciplinary oversight, bringing together immunologists, data scientists, ethicists, clinicians, and patient representatives, should guide decisions about data use, design objectives, and therapeutic deployment. International consortia may serve as coordinating bodies to establish shared principles for algorithmic transparency, data equity, and biosafety. As generative biology becomes increasingly autonomous, the human responsibility for defining its boundaries and purposes becomes more, not less, essential. The future of programmable immunity will thus depend not only on the sophistication of its algorithms but also on the moral architecture of the institutions that steward them [204]. Only through such ethically adaptive governance can programmable immunity mature into a discipline that safeguards both biological integrity and societal trust in the era of generative medicine.

10. Conclusion

The convergence of generative artificial intelligence, cellular engineering, and immunology marks a transformative moment in the life sciences. What began as an attempt to enhance receptor design has matured into a new framework for understanding and shaping biology itself. Through generative modeling, immune repertoires, signaling networks, and cellular phenotypes can now be explored as dynamic design spaces rather than static natural entities. This shift dissolves the traditional boundaries between discovery and fabrication, between observing life and programming it. It redefines biology as an editable, self-informing system, one in which learning becomes a property of both the organism and the methods that study it.

AI-enabled generative immunoengineering creates a continuum that links molecular design, phenotypic programming, and automated biomanufacturing within integrated feedback systems. The design–build–test–learn cycle converts cell therapy development from a linear experimental sequence into a continuously adaptive process. Each iteration strengthens the intelligence of the system by transforming empirical outcomes into computational insight, allowing therapeutic design and biological understanding to co-evolve. In this model, medicine becomes a learning enterprise, guided by algorithms that refine their predictions through every patient and experiment. Such integration heralds the emergence of a “living laboratory” paradigm, where computation, experimentation, and clinical feedback operate as one self-optimizing network.

Translationally, this paradigm redefines both the structure and the pace of biomedical innovation. The ability to generate, validate, and deploy patient-specific immune receptors within compressed timeframes makes precision immunotherapy responsive to individual and population-level dynamics. Automated manufacturing environments equipped with digital twins and real-time analytics promise production systems that not only reproduce validated protocols but improve upon them through continuous optimization. As regulatory frameworks evolve to accommodate algorithmic validation and adaptive approval, the distinction between discovery, manufacturing, and clinical deployment will diminish, giving rise to a self-improving therapeutic ecosystem embedded within healthcare itself. In the long view, this same architecture may extend beyond immunology, linking generative genomics, regenerative medicine, and neural interface design into a unified science of programmable biology.

At the same time, this technological acceleration intensifies questions of ethics, governance, and equity. The capacity to design immunity at will requires oversight systems that are as adaptive as the technologies they regulate. Consent, data ownership, intellectual property, and algorithmic transparency must be treated as integral components of the research architecture rather than external constraints. The future of generative immunoengineering will depend on maintaining equilibrium between innovation and accountability, openness and security, personalization and fairness. The sophistication of our moral and institutional design must keep pace with the sophistication of our computational tools. Ethical foresight must therefore evolve as dynamically as the algorithms themselves, ensuring that creativity and responsibility remain inseparable.

Ultimately, generative immunoengineering invites a new way of thinking about intervention in biology. It envisions a medicine in which intelligence, human and artificial, acts as a creative partner in the shaping of immune function. The concept of programmable immunity captures this synthesis: a vision of therapeutic science that is predictive, personalized, and continuously learning, yet guided by ethical foresight and social responsibility. It signals the beginning of a new epistemology in which the design of life and the design of knowledge become one continuous act. The challenge ahead is not only to design better immune cells but to design wiser systems (scientific, ethical, and societal) through which such power can be directed toward the enduring goal of human well-being.

Author Contributions

O.A. and M.N. contributed equally to the design and writing of the main manuscript text.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

Declare conflicts of interest or state “The authors declare no conflict of interest.”

References

Navarro Quiroz R, Villarreal Camacho J, Zarate Penata E, et al. Multiscale information processing in the immune system. Front Immunol 2025, 16, 1563992.
Dewaker V, Morya VK, Kim YH, et al. Revolutionizing oncology: the role of Artificial Intelligence (AI) as an antibody design, and optimization tools. Biomark Res 2025, 13, 52.
Rafelski SM, Theriot JA. Establishing a conceptual framework for holistic cell states and state transitions. Cell 2024, 187, 2633–2651.
Leary AY, Scott D, Gupta NT, et al. Designing meaningful continuous representations of T cell receptor sequences with deep generative models. Nat Commun 2024, 15, 4271.
Ibrahim M, Khalil YA, Amirrajab S, et al. Generative AI for synthetic data across multiple medical modalities: A systematic review of recent developments and challenges. Comput Biol Med 2025, 189, 109834.
Zhang P, Wei L, Li J, et al. Artificial intelligence-guided strategies for next-generation biological sequence design. Natl Sci Rev 2024, 11, nwae343.
Ertelt M, Moretti R, Meiler J, et al. Self-supervised machine learning methods for protein design improve sampling but not the identification of high-fitness variants. Sci Adv 2025, 11, eadr7338.
Krapp LF, Meireles FA, Abriata LA, et al. Context-aware geometric deep learning for protein sequence design. Nat Commun 2024, 15, 6273.
Ma JL, H.; Hu, Y.; Huang, J. Physicochemically Informed Dual-Conditioned Generative Model of T-Cell Receptor Variable Regions for Cellular Therapy. arXiv, 2025.
Dolorfino M, Samanta R, Vorobieva A. ProteinMPNN Recovers Complex Sequence Properties of Transmembrane beta-barrels. bioRxiv 2024.
Chouleur T, Etchegaray C, Villain L, et al. A strategy for multimodal integration of transcriptomics, proteomics, and radiomics data for the prediction of recurrence in patients with IDH-mutant gliomas. Int J Cancer 2025, 157, 573–587.
Berson E, Chung P, Espinosa C, et al. Unlocking human immune system complexity through AI. Nat Methods 2024, 21, 1400–1402.
Gao S, Fang A, Huang Y, et al. Empowering biomedical discovery with AI agents. Cell 2024, 187, 6125–6151.
O’Donnell TJ, Kanduri C, Isacchini G, et al. Reading the repertoire: Progress in adaptive immune receptor analysis using machine learning. Cell Syst 2024, 15, 1168–1189.
Liu YT, Zhang LL, Jiang ZY, et al. Applications of Artificial Intelligence in Biotech Drug Discovery and Product Development. MedComm (2020) 2025, 6, e70317.
Katoh H, Komura D, Furuya G, et al. Immune repertoire profiling for disease pathobiology. Pathol Int 2023, 73, 1–11.
Ou Y, Guo S. Safety risks and ethical governance of biomedical applications of synthetic biology. Front Bioeng Biotechnol 2023, 11, 1292029.
Kong Y, Li J, Zhao X, et al. CAR-T cell therapy: developments, challenges and expanded applications from cancer to autoimmunity. Front Immunol 2024, 15, 1519671.
de Greef PC, Oakes T, Gerritsen B, et al. The naive T-cell receptor repertoire has an extremely broad distribution of clone sizes. Elife 2020, 9.
Tsahouridis O, Xu M, Song F, et al. The landscape of CAR-engineered innate immune cells for cancer immunotherapy. Nat Cancer 2025, 6, 1145–1156.
Gangwal A, Ansari A, Ahmad I, et al. Generative artificial intelligence in drug discovery: basic framework, recent advances, challenges, and opportunities. Front Pharmacol 2024, 15, 1331062.
Johnson JAI, Bergman DR, Rocha HL, et al. Human interpretable grammar encodes multicellular systems biology models to democratize virtual cell laboratories. Cell 2025, 188, 4711–4733.
Weber CR, Rubio T, Wang L, et al. Reference-based comparison of adaptive immune receptor repertoires. Cell Rep Methods 2022, 2, 100269.
Bjerregaard A, Groth PM, Hauberg S, et al. Foundation models of protein sequences: A brief overview. Curr Opin Struct Biol 2025, 91, 103004.
Fox DR, Taveneau C, Clement J, et al. Code to complex: AI-driven de novo binder design. Structure 2025, 33, 1631–1642.
Wang M, Patsenker J, Li H, et al. Supervised fine-tuning of pre-trained antibody language models improves antigen specificity prediction. PLoS Comput Biol 2025, 21, e1012153.
Chen LT, Quinn Z, Dumas M, et al. Target sequence-conditioned design of peptide binders using masked language modeling. Nat Biotechnol 2025.
Gasser HC, Rajan A, Alfaro JA. A novel decoding strategy for ProteinMPNN to design with less visibility to cytotoxic T-lymphocytes. Comput Struct Biotechnol J 2025, 27, 3693–3703.
Mei A, Letscher KP, Reddy S. Engineering next-generation chimeric antigen receptor-T cells: recent breakthroughs and remaining challenges in design and screening of novel chimeric antigen receptor variants. Curr Opin Biotechnol 2024, 90, 103223.
Zhang J, Che Y, Liu R, et al. Deep learning-driven multi-omics analysis: enhancing cancer diagnostics and therapeutics. Brief Bioinform 2025, 26.
Sandberg TE, Salazar MJ, Weng LL, et al. The emergence of adaptive laboratory evolution as an efficient tool for biological discovery and industrial biotechnology. Metab Eng 2019, 56, 1–16.
Zhu L, Mou W, Hong C, et al. The Evaluation of Generative AI Should Include Repetition to Assess Stability. JMIR Mhealth Uhealth 2024, 12, e57978.
Mirakhori F, Niazi SK. Harnessing the AI/ML in Drug and Biological Products Discovery and Development: The Regulatory Perspective. Pharmaceuticals (Basel) 2025, 18.
Teng F, Cui T, Zhou L, et al. Programmable synthetic receptors: the next-generation of cell and gene therapies. Signal Transduct Target Ther 2024, 9, 7.
Weissenow K, Rost B. Are protein language models the new universal key? Curr Opin Struct Biol 2025, 91, 102997.
Chen Y, Wang Z, Zeng X, et al. Molecular language models: RNNs or transformer? Brief Funct Genomics 2023, 22, 392–400.
Brandes N, Ofer D, Peleg Y, et al. ProteinBERT: a universal deep-learning model of protein sequence and function. Bioinformatics 2022, 38, 2102–2110.
Alley EC, Khimulya G, Biswas S, et al. Unified rational protein engineering with sequence-based deep representation learning. Nat Methods 2019, 16, 1315–1322.
Lupo U, Sgarbossa D, Bitbol AF. Protein language models trained on multiple sequence alignments learn phylogenetic relationships. Nat Commun 2022, 13, 6298.
Lin Z, Akin H, Rao R, et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 2023, 379, 1123–1130.
Chen Y, Xu Y, Liu D, et al. An end-to-end framework for the prediction of protein structure and fitness from single sequence. Nat Commun 2024, 15, 7400.
Wang M, Patsenker J, Li H, et al. Language model-based B cell receptor sequence embeddings can effectively encode receptor specificity. Nucleic Acids Res 2024, 52, 548–557.
Justyna M, Zirbel C, Antczak M, et al. Graph neural network and diffusion model for modeling RNA interatomic interactions. Bioinformatics 2025, 41.
Soleymani F, Paquet E, Viktor HL, et al. Structure-based protein and small molecule generation using EGNN and diffusion models: A comprehensive review. Comput Struct Biotechnol J 2024, 23, 2779–2797.
Nunes FVM, Behrens LMP, Weimer RD, et al. Deep learning methods and applications in single-cell multimodal data integration. Mol Omics 2025.
Liu 刘俊伟 J, Cen 岑萧萍 X, Yi 伊晨昕 C, et al. Challenges in AI-driven Biomedical Multimodal Data Fusion and Analysis. Genomics Proteomics Bioinformatics 2025, 23.
Wu X, Yang X, Dai Y, et al. Single-cell sequencing to multi-omics: technologies and applications. Biomark Res 2024, 12, 110.
Lamiable A, Champetier T, Leonardi F, et al. Revealing invisible cell phenotypes with conditional generative modeling. Nat Commun 2023, 14, 6386.
Agarwal AA, Harrang J, Noble D, et al. AlphaBind, a domain-specific model to predict and optimize antibody-antigen binding affinity. MAbs 2025, 17, 2534626.
Karthikeyan D, Bennett SN, Reynolds AG, et al. Conditional generation of real antigen-specific T cell receptor sequences. Nat Mach Intell 2025, 7, 1494–1509.
Lawton ML, Inge MM, Blum BC, et al. Multiomic profiling of chronically activated CD4+ T cells identifies drivers of exhaustion and metabolic reprogramming. PLoS Biol 2024, 22, e3002943.
Iu DS, Maya J, Vu LT, et al. Transcriptional reprogramming primes CD8+ T cells toward exhaustion in Myalgic encephalomyelitis/chronic fatigue syndrome. Proc Natl Acad Sci U S A 2024, 121, e2415119121.
Hudson D, Fernandes RA, Basham M, et al. Can we predict T cell specificity with digital biology and machine learning? Nat Rev Immunol 2023, 23, 511–521.
Leggieri PA, Liu Y, Hayes M, et al. Integrating Systems and Synthetic Biology to Understand and Engineer Microbiomes. Annu Rev Biomed Eng 2021, 23, 169–201.
Leipzig J, Nust D, Hoyt CT, et al. The role of metadata in reproducible computational research. Patterns (N Y) 2021, 2, 100322. [Google Scholar] [CrossRef]
Tule S, Foley G, Boden M Do protein language models learn phylogeny? Brief Bioinform 2024, 26.
Vieira MC, Palm AE, Stamper CT, et al. Germline-encoded specificities and the predictability of the B cell response. PLoS Pathog 2023, 19, e1011603. [Google Scholar]
Javdan SB, Deans TL Design and development of engineered receptors for cell and tissue engineering. Curr Opin Syst Biol 2021, 28.
Wong WW, Lim WA Golden age of immunoengineering. Immunol Rev 2023, 320, 4–9. [CrossRef]
Rees AR Understanding the human antibody repertoire. MAbs 2020, 12, 1729683. [CrossRef]
Villanueva-Flores F, Sanchez-Villamil JI, Garcia-Atutxa I Publisher Correction: AI-driven epitope prediction: a systematic review, comparative analysis, and practical guide for vaccine development. NPJ Vaccines 2025, 10, 209. [CrossRef] [PubMed]
Elfatimi E, Lekbach Y, Prakash S, et al. Artificial intelligence and machine learning in the development of vaccines and immunotherapeutics-yesterday, today, and tomorrow. Front Artif Intell 2025, 8, 1620572. [Google Scholar] [CrossRef]
Shugay M, Bagaev DV, Zvyagin IV, et al. VDJdb: a curated database of T-cell receptor sequences with known antigen specificity. Nucleic Acids Res 2018, 46, D419–D427. [Google Scholar] [CrossRef]
Zhang W, Wang L, Liu K, et al. PIRD: Pan Immune Repertoire Database. Bioinformatics 2020, 36, 897–903. [Google Scholar] [CrossRef] [PubMed]
Textor J, Buytenhuijs F, Rogers D, et al. Machine learning analysis of the T cell receptor repertoire identifies sequence features of self-reactivity. Cell Syst 2023, 14, 1059–1073. [Google Scholar] [CrossRef]
Greenshields-Watson A, Abanades B, Deane CM Investigating the ability of deep learning-based structure prediction to extrapolate and/or enrich the set of antibody CDR canonical forms. Front Immunol 2024, 15, 1352703. [CrossRef]
Ostrovsky-Berman M, Frankel B, Polak P, et al. Immune2vec: Embedding B/T Cell Receptor Sequences in R (N) Using Natural Language Processing. Front Immunol 2021, 12, 680687. [Google Scholar] [CrossRef]
Sidhom JW, Larman HB, Pardoll DM, et al. DeepTCR is a deep learning framework for revealing sequence concepts within T-cell repertoires. Nat Commun 2021, 12, 1605. [Google Scholar] [CrossRef] [PubMed]
Li L, Peng X, Batliwala M, et al. Crystal structures of MHC class I complexes reveal the elusive intermediate conformations explored during peptide editing. Nat Commun 2023, 14, 5020. [Google Scholar] [CrossRef]
Sumida KH, Nunez-Franco R, Kalvet I, et al. Improving Protein Expression, Stability, and Function with ProteinMPNN. J Am Chem Soc 2024, 146, 2054–2061. [Google Scholar] [CrossRef]
Meng F, Zhou N, Hu G, et al. A comprehensive overview of recent advances in generative models for antibodies. Comput Struct Biotechnol J 2024, 23, 2648–2660. [Google Scholar] [CrossRef] [PubMed]
Saadat M, Zare-Mirakabad F, Masoudi-Nejad A, et al. HLAPepBinder: An Ensemble Model for The Prediction Of HLA-Peptide Binding. Iran J Biotechnol 2024, 22, e3927.
Gioia D, Bertazzo M, Recanatini M, et al. Dynamic Docking: A Paradigm Shift in Computational Drug Discovery. Molecules 2017, 22.
Guner Yilmaz OZ, Doruker P, Kurkcuoglu O. A Computationally Efficient Method to Generate Plausible Conformers for Ensemble Docking and Binding Free Energy Calculations. J Chem Inf Model 2025, 65, 8137–8157.
Bielska W, Jaszczyszyn I, Dudzic P, et al. Applying computational protein design to therapeutic antibody discovery - current state and perspectives. Front Immunol 2025, 16, 1571371.
Sverchkov Y, Craven M. A review of active learning approaches to experimental design for uncovering biological networks. PLoS Comput Biol 2017, 13, e1005466.
Slavny P, Hegde M, Doerner A, et al. Advancements in mammalian display technology for therapeutic antibody development and beyond: current landscape, challenges, and future prospects. Front Immunol 2024, 15, 1469329.
Ching T, Himmelstein DS, Beaulieu-Jones BK, et al. Opportunities and obstacles for deep learning in biology and medicine. J R Soc Interface 2018, 15.
Wossnig L, Furtmann N, Buchanan A, et al. Best practices for machine learning in antibody discovery and development. Drug Discov Today 2024, 29, 104025.
Rajagopal N, Choudhary U, Tsang K, et al. Deep learning-based design and experimental validation of a medicine-like human antibody library. Brief Bioinform 2024, 26.
Dixon T, MacPherson D, Mostofian B, et al. Predicting the structural basis of targeted protein degradation by integrating molecular dynamics simulations with structural mass spectrometry. Nat Commun 2022, 13, 5884.
Hati S, Bhattacharyya S. Incorporating modeling and simulations in undergraduate biophysical chemistry course to promote understanding of structure-dynamics-function relationships in proteins. Biochem Mol Biol Educ 2016, 44, 140–159.
Hamamsy T, Barot M, Morton JT, et al. Learning sequence, structure, and function representations of proteins with language models. bioRxiv 2023.
Dibaeinia P, Ojha A, Sinha S. Interpretable AI for inference of causal molecular relationships from omics data. Sci Adv 2025, 11, eadk0837.
Davidsen K, Olson BJ, DeWitt WS 3rd, et al. Deep generative models for T cell receptor protein sequences. Elife 2019, 8.
Isacchini G, Sethna Z, Elhanati Y, et al. Generative models of T-cell receptor sequences. Phys Rev E 2020, 101, 062414.
Tang X, Dai H, Knight E, et al. A survey of generative AI for de novo drug design: new frontiers in molecule and protein generation. Brief Bioinform 2024, 25.
Borrman T, Pierce BG, Vreven T, et al. High-throughput modeling and scoring of TCR-pMHC complexes to predict cross-reactive peptides. Bioinformatics 2021, 36, 5377–5385.
Seo SY, Rhee JK. TCR-epiDiff: solving dual challenges of TCR generation and binding prediction. Bioinformatics 2025, 41, i125–i132.
Kalemati M, Noroozi A, Shahbakhsh A, et al. ParaAntiProt provides paratope prediction using antibody and protein language models. Sci Rep 2024, 14, 29141.
Ghanbarpour A, Jiang M, Foster D, et al. Structure-free antibody paratope similarity prediction for in silico epitope binning via protein language models. iScience 2023, 26, 106036.
Pruvost T, Mathieu M, Dubois S, et al. Deciphering cross-species reactivity of LAMP-1 antibodies using deep mutational epitope mapping and AlphaFold. MAbs 2023, 15, 2175311.
Mason DM, Reddy ST. Predicting adaptive immune receptor specificities by machine learning is a data generation problem. Cell Syst 2024, 15, 1190–1197.
Abeer A, Urban NM, Weil MR, et al. Multi-objective latent space optimization of generative molecular design models. Patterns (N Y) 2024, 5, 101042.
Lindner SE, Johnson SM, Brown CE, et al. Chimeric antigen receptor signaling: Functional consequences and design implications. Sci Adv 2020, 6, eaaz3223.
Shahzadi M, Rafique H, Waheed A, et al. Artificial intelligence for chimeric antigen receptor-based therapies: a comprehensive review of current applications and future perspectives. Ther Adv Vaccines Immunother 2024, 12, 25151355241305856.
Guedan S, Calderon H, Posey AD Jr, et al. Engineering and Design of Chimeric Antigen Receptors. Mol Ther Methods Clin Dev 2019, 12, 145–156.
Castellanos-Rueda R, Wang KK, Forster JL, et al. Dissecting the role of CAR signaling architectures on T cell activation and persistence using pooled screens and single-cell sequencing. Sci Adv 2025, 11, eadp4008.
Smirnov S, Mateikovich P, Samochernykh K, et al. Recent advances on CAR-T signaling pave the way for prolonged persistence and new modalities in clinic. Front Immunol 2024, 15, 1335424.
Alsaieedi AA, Zaher KA. Tracing the development of CAR-T cell design: from concept to next-generation platforms. Front Immunol 2025, 16, 1615212.
Sutanto HF, D. Integrating artificial intelligence into small molecule development for precision cancer immunomodulation therapy. npj Drug Discovery 2025, 2.
Du Q, Wang H, Jiang B, et al. Advancing genetic engineering with active learning: theory, implementations and potential opportunities. Brief Bioinform 2025, 26.
Ferdous S, Shihab IF, Chowdhury R, et al. Reinforcement learning-guided control strategies for CAR T-cell activation and expansion. Biotechnol Bioeng 2024, 121, 2868–2880.
Yeku OO, Brentjens RJ. Armored CAR T-cells: utilizing cytokines and pro-inflammatory ligands to enhance CAR T-cell anti-tumour efficacy. Biochem Soc Trans 2016, 44, 412–418.
Li X, Chen T, Li X, et al. Therapeutic targets of armored chimeric antigen receptor T cells navigating the tumor microenvironment. Exp Hematol Oncol 2024, 13, 96.
Han X, Wang Y, Wei J, et al. Multi-antigen-targeted chimeric antigen receptor T cells for cancer therapy. J Hematol Oncol 2019, 12, 128.
Zitnik M, Li MM, Wells A, et al. Current and future directions in network biology. Bioinform Adv 2024, 4, vbae099.
Blay V, Pandiella A. Strategies to boost antibody selectivity in oncology. Trends Pharmacol Sci 2024, 45, 1135–1149.
This S, Costantino S, Melichar HJ. Machine learning predictions of T cell antigen specificity from intracellular calcium dynamics. Sci Adv 2024, 10, eadk2298.
Putignano G, Ruiperez-Campillo S, Yuan Z, et al. Mathematical models and computational approaches in CAR-T therapeutics. Front Immunol 2025, 16, 1581210.
McFaline-Figueroa JL, Srivatsan S, Hill AJ, et al. Multiplex single-cell chemical genomics reveals the kinase dependence of the response to targeted therapy. Cell Genom 2024, 4, 100487.
Biederstadt A, Manzar GS, Daher M. Multiplexed engineering and precision gene editing in cellular immunotherapy. Front Immunol 2022, 13, 1063303.
Zambrano N, Froechlich G, Lazarevic D, et al. High-Throughput Monoclonal Antibody Discovery from Phage Libraries: Challenging the Current Preclinical Pipeline to Keep the Pace with the Increasing mAb Demand. Cancers (Basel) 2022, 14.
Lopez R, Regier J, Cole MB, et al. Deep generative modeling for single-cell transcriptomics. Nat Methods 2018, 15, 1053–1058.
Specht H, Emmott E, Petelski AA, et al. Single-cell proteomic and transcriptomic analysis of macrophage heterogeneity using SCoPE2. Genome Biol 2021, 22, 50.
Vijayan RSK, Kihlberg J, Cross JB, et al. Enhancing preclinical drug discovery with artificial intelligence. Drug Discov Today 2022, 27, 967–984.
Audagnotto M, Czechtizky W, De Maria L, et al. Machine learning/molecular dynamic protein structure prediction approach to investigate the protein conformational ensemble. Sci Rep 2022, 12, 10018.
Chen L, Balabanidou V, Remeta DP, et al. Structural instability tuning as a regulatory mechanism in protein-protein interactions. Mol Cell 2011, 44, 734–744.
Hartung T. AI, agentic models and lab automation for scientific discovery - the beginning of scAInce. Front Artif Intell 2025, 8, 1649155.
Irvine DJ, Maus MV, Mooney DJ, et al. The future of engineered immune cell therapies. Science 2022, 378, 853–858.
Wang W, Hariharan M, Ding W, et al. Genetics and Environment Distinctively Shape the Human Immune Cell Epigenome. bioRxiv 2025.
Kondilis-Mangum HD, Wade PA. Epigenetics and the adaptive immune response. Mol Aspects Med 2013, 34, 813–825.
Sadria M, Layton A. scVAEDer: integrating deep diffusion models and variational autoencoders for single-cell transcriptomics analysis. Genome Biol 2025, 26, 64.
Jung S. Advances in modeling cellular state dynamics: integrating omics data and predictive techniques. Anim Cells Syst (Seoul) 2025, 29, 72–83.
Rodov A, Baniadam H, Zeiser R, et al. Towards the Next Generation of Data-Driven Therapeutics Using Spatially Resolved Single-Cell Technologies and Generative AI. Eur J Immunol 2025, 55, e202451234.
Ding J, Regev A. Deep generative model embedding of single-cell RNA-Seq profiles on hyperspheres and hyperbolic spaces. Nat Commun 2021, 12, 2554.
Choi H, Kim H, Chung H, et al. Application of computational algorithms for single-cell RNA-seq and ATAC-seq in neurodegenerative diseases. Brief Funct Genomics 2025, 24.
Wang C, Liu ZP. Diffusion-based generation of gene regulatory networks from scRNA-seq data with DigNet. Genome Res 2025, 35, 340–354.
Yeo GHT, Saksena SD, Gifford DK. Generative modeling of single-cell time series with PRESCIENT enables prediction of cell trajectories with interventions. Nat Commun 2021, 12, 3222.
Weerakoon H, Mohamed A, Wong Y, et al. Integrative temporal multi-omics reveals uncoupling of transcriptome and proteome during human T cell activation. NPJ Syst Biol Appl 2024, 10, 21.
Wang X, Fan D, Yang Y, et al. Integrative multi-omics approaches to explore immune cell functions: Challenges and opportunities. iScience 2023, 26, 106359.
Lotfollahi M, Klimovskaia Susmelj A, De Donno C, et al. Predicting cellular responses to complex perturbations in high-throughput screens. Mol Syst Biol 2023, 19, e11517.
Lotfollahi M, Wolf FA, Theis FJ. scGen predicts single-cell perturbation responses. Nat Methods 2019, 16, 715–721.
Metzner E, Southard KM, Norman TM. Multiome Perturb-seq unlocks scalable discovery of integrated perturbation effects on the transcriptome and epigenome. Cell Syst 2025, 16, 101161.
Zhou P, Shi H, Huang H, et al. Single-cell CRISPR screens in vivo map T cell fate regulomes in cancer. Nature 2023, 624, 154–163.
Belk JA, Yao W, Ly N, et al. Genome-wide CRISPR screens of T cell exhaustion identify chromatin remodeling factors that limit T cell persistence. Cancer Cell 2022, 40, 768–786.
Xia L, Komissarova A, Jacover A, et al. Systematic identification of gene combinations to target in innate immune cells to enhance T cell activation. Nat Commun 2023, 14, 6295.
Prochazka L, Michaels YS, Lau C, et al. Synthetic gene circuits for cell state detection and protein tuning in human pluripotent stem cells. Mol Syst Biol 2022, 18, e10886.
Sole R, Conde-Pueyo N, Pla-Mauri J, et al. Open problems in synthetic multicellularity. NPJ Syst Biol Appl 2024, 10, 151.
Nilsson A, Peters JM, Meimetis N, et al. Artificial neural networks enable genome-scale simulations of intracellular signaling. Nat Commun 2022, 13, 3069.
Racovita A, Jaramillo A. Reinforcement learning in synthetic gene circuits. Biochem Soc Trans 2020, 48, 1637–1643.
Ding X, Zhang L, Fan M, et al. Network-based transfer of pan-cancer immunotherapy responses to guide breast cancer prognosis. NPJ Syst Biol Appl 2025, 11, 4.
DaSilva LF, Senan S, Patel ZM, et al. DNA-Diffusion: Leveraging Generative Models for Controlling Chromatin Accessibility and Gene Expression via Synthetic Regulatory Elements. bioRxiv 2024.
Garcia BT, Westerfield L, Yelemali P, et al. Improving automated deep phenotyping through large language models using retrieval-augmented generation. Genome Med 2025, 17, 91.
Dong M, Wang L, Hu N, et al. Integration of multi-omics approaches in exploring intra-tumoral heterogeneity. Cancer Cell Int 2025, 25, 317.
van Lent P, Schmitz J, Abeel T. Simulated Design-Build-Test-Learn Cycles for Consistent Comparison of Machine Learning Methods in Metabolic Engineering. ACS Synth Biol 2023, 12, 2588–2599.
Kitano S, Lin C, Foo JL, et al. Synthetic biology: Learning the way toward high-precision biological design. PLoS Biol 2023, 21, e3002116.
Daniszewski M, Crombie DE, Henderson R, et al. Automated Cell Culture Systems and Their Applications to Human Pluripotent Stem Cell Studies. SLAS Technol 2018, 23, 315–325.
Lei Y, Tsang JS. Systems Human Immunology and AI: Immune Setpoint and Immune Health. Annu Rev Immunol 2025, 43, 693–722.
Gurdo N, Volke DC, McCloskey D, et al. Automating the design-build-test-learn cycle towards next-generation bacterial cell factories. N Biotechnol 2023, 74, 1–15.
Si Y, Zou J, Gao Y, et al. Foundation models in molecular biology. Biophys Rep 2024, 10, 135–151.
Moldwin A, Shehu A. Foundation Models for AI-Enabled Biological Design. arXiv 2025.
Al-Jumaily A, Mukaidaisi M, Vu A, et al. Examining multi-objective deep reinforcement learning frameworks for molecular design. Biosystems 2023, 232, 104989.
Wang J, Zhu F. Multi-objective molecular generation via clustered Pareto-based reinforcement learning. Neural Netw 2024, 179, 106596.
Wytock TP, Motter AE. Cell reprogramming design by transfer learning of functional transcriptional networks. Proc Natl Acad Sci U S A 2024, 121, e2312942121.
Zrimec J, Fu X, Muhammad AS, et al. Controlling gene expression with deep generative design of regulatory DNA. Nat Commun 2022, 13, 5099.
Niarakis A, Laubenbacher R, An G, et al. Immune digital twins for complex human pathologies: applications, limitations, and challenges. NPJ Syst Biol Appl 2024, 10, 141.
Blaby IK, Cheng JF. Building a custom high-throughput platform at the Joint Genome Institute for DNA construct design and assembly-present and future challenges. Synth Biol (Oxf) 2020, 5, ysaa023.
Ma Y, Zhang Z, Jia B, et al. Automated high-throughput DNA synthesis and assembly. Heliyon 2024, 10, e26967.
Rezalotfi A, Fritz L, Forster R, et al. Challenges of CRISPR-Based Gene Editing in Primary T Cells. Int J Mol Sci 2022, 23.
Kwak S, Lee H, Yu D, et al. Microfluidic Platforms for Ex Vivo and In Vivo Gene Therapy. Biosensors (Basel) 2025, 15.
Hunt AC, Rasor BJ, Seki K, et al. Cell-Free Gene Expression: Methods and Applications. Chem Rev 2025, 125, 91–149.
Lu Y. The future of cell-free synthetic biology. Biotechnol Notes 2024, 5, A1–A3.
Craig T, Holland R, D’Amore R, et al. Leaf LIMS: A Flexible Laboratory Information Management System with a Synthetic Biology Focus. ACS Synth Biol 2017, 6, 2273–2280.
Zhou Y, Shao N, Bessa de Castro R, et al. Evaluation of Single-Cell Cytokine Secretion and Cell-Cell Interactions with a Hierarchical Loading Microwell Chip. Cell Rep 2020, 31, 107574.
Lim J, Park C, Kim M, et al. Advances in single-cell omics and multiomics for high-resolution molecular profiling. Exp Mol Med 2024, 56, 515–526.
Roth JP, Bajorath J. Relationship between prediction accuracy and uncertainty in compound potency prediction using deep neural networks and control models. Sci Rep 2024, 14, 6536.
Matzko R, Konur S. Technologies for design-build-test-learn automation and computational modelling across the synthetic biology workflow: a review. Network Modeling Analysis in Health Informatics and Bioinformatics 2024, 13.
Goshisht MK. Machine Learning and Deep Learning in Synthetic Biology: Key Architectures, Applications, and Challenges. ACS Omega 2024, 9, 9921–9945.
Holowko MB, Frow EK, Reid JC, et al. Building a biofoundry. Synth Biol (Oxf) 2021, 6, ysaa026.
Bultelle M, Casas A, Kitney R. Engineering biology and automation-Replicability as a design principle. Eng Biol 2024, 8, 53–68.
Emmert-Streib F, Yli-Harja O. What Is a Digital Twin? Experimental Design for a Data-Centric Machine Learning Perspective in Health. Int J Mol Sci 2022, 23.
McLaughlin JA, Beal J, Misirli G, et al. The Synthetic Biology Open Language (SBOL) Version 3, Simplified Data Exchange for Bioengineering. Front Bioeng Biotechnol 2020, 8, 1009.
Gierend K, Kruger F, Genehr S, et al. Provenance Information for Biomedical Data and Workflows: Scoping Review. J Med Internet Res 2024, 26, e51297.
Singh N, Lane S, Yu T, et al. A generalized platform for artificial intelligence-powered autonomous enzyme engineering. Nat Commun 2025, 16, 5648.
Rapp JT, Bremer BJ, Romero PA. Self-driving laboratories to autonomously navigate the protein fitness landscape. Nat Chem Eng 2024, 1, 97–107.
Martin HG, Radivojevic T, Zucker J, et al. Perspectives for self-driving labs in synthetic biology. Curr Opin Biotechnol 2023, 79, 102881.
Melocchi A, Schmittlein B, Sadhu S, et al. Automated manufacturing of cell therapies. J Control Release 2025, 381, 113561.
Wang B, Chen RQ, Li J, et al. Interfacing data science with cell therapy manufacturing: where we are and where we need to be. Cytotherapy 2024, 26, 967–979.
Cai Y, Chen R, Gao S, et al. Artificial intelligence applied in neoantigen identification facilitates personalized cancer immunotherapy. Front Oncol 2022, 12, 1054231.
Zou H, Liu W, Wang X, et al. Dynamic monitoring of circulating tumor DNA reveals outcomes and genomic alterations in patients with relapsed or refractory large B-cell lymphoma undergoing CAR T-cell therapy. J Immunother Cancer 2024, 12.
Bottini M, Ryu SJ, Terander AE, et al. The Ever-Evolving Regulatory Landscape Concerning Development and Clinical Application of Machine Intelligence: Practical Consequences for Spine Artificial Intelligence Research. Neurospine 2025, 22, 134–143.
Hersh W, Fultz Hollis K. Results and implications for generative AI in a large introductory biomedical and health informatics course. NPJ Digit Med 2024, 7, 247.
Chao R, Mishra S, Si T, et al. Engineering biological systems using automated biofoundries. Metab Eng 2017, 42, 98–108.
Nettleton DF, Mari-Buye N, Marti-Soler H, et al. Smart Sensor Control and Monitoring of an Automated Cell Expansion Process. Sensors (Basel) 2023, 23.
Cheng F, Xie W, Zheng H. Digital Twin Calibration for Biological System-of-Systems: Cell Culture Manufacturing Process. arXiv 2024.
Hie BL, Shanker VR, Xu D, et al. Efficient evolution of human antibodies from general protein language models. Nat Biotechnol 2024, 42, 275–283.
Jonny, Sitepu EC, Nidom CA, et al. Ex Vivo-Generated Tolerogenic Dendritic Cells: Hope for a Definitive Therapy of Autoimmune Diseases. Curr Issues Mol Biol 2024, 46, 4035–4048.
Stucchi A, Maspes F, Montee-Rodrigues E, et al. Engineered Treg cells: The heir to the throne of immunotherapy. J Autoimmun 2024, 144, 102986.
Wang K, Song B, Zhu Y, et al. Peripheral nerve-derived CSF1 induces BMP2 expression in macrophages to promote nerve regeneration and wound healing. NPJ Regen Med 2024, 9, 35.
O’Brien H, Salm M, Morton LT, et al. A modular protein language modelling approach to immunogenicity prediction. PLoS Comput Biol 2024, 20, e1012511.
Alowidi N, Ali R, Sadaqah M, et al. Advancing Kidney Transplantation: A Machine Learning Approach to Enhance Donor-Recipient Matching. Diagnostics (Basel) 2024, 14.
Weimer ET, Newhall KA. Machine learning enhanced immunologic risk assessments for solid organ transplantation. Sci Rep 2025, 15, 7943.
Alsaafeen BH, Ali BR, Elkord E. Combinational therapeutic strategies to overcome resistance to immune checkpoint inhibitors. Front Immunol 2025, 16, 1546717.
Liu Z, Zhang J, Hong L, et al. Multiscale mathematical model-informed reinforcement learning optimizes combination treatment scheduling in glioblastoma evolution. Sci Adv 2025, 11, eadv3316.
Prelaj A, Miskovic V, Zanitti M, et al. Artificial intelligence for predictive biomarker discovery in immuno-oncology: a systematic review. Ann Oncol 2024, 35, 29–65.
Ahlquist KD, Sugden LA, Ramachandran S. Enabling interpretable machine learning for biological data with reliability scores. PLoS Comput Biol 2023, 19, e1011175.
Shiferaw KB, Roloff M, Balaur I, et al. Guidelines and standard frameworks for artificial intelligence in medicine: a systematic review. JAMIA Open 2025, 8, ooae155.
Vithlani J, Hawksworth C, Elvidge J, et al. Economic evaluations of artificial intelligence-based healthcare interventions: a systematic literature review of best practices in their conduct and reporting. Front Pharmacol 2023, 14, 1220950.
Pannu J, Bloomfield D, MacKnight R, et al. Dual-use capabilities of concern of biological AI models. PLoS Comput Biol 2025, 21, e1012975.
Bloomfield D, Pannu J, Zhu AW, et al. AI and biosecurity: The need for governance. Science 2024, 385, 831–833.
Derraz B, Breda G, Kaempf C, et al. New regulatory thinking is needed for AI-based personalised drug and cell therapies in precision oncology. NPJ Precis Oncol 2024, 8, 23.
Groff-Vindman CS, Trump BD, Cummings CL, et al. The convergence of AI and synthetic biology: the looming deluge. npj Biomedical Innovations 2025, 2.
Undheim TA. The whack-a-mole governance challenge for AI-enabled synthetic biology: literature review and emerging frameworks. Front Bioeng Biotechnol 2024, 12, 1359768.
Wu J, Koelzer VH. Towards generative digital twins in biomedical research. Comput Struct Biotechnol J 2024, 23, 3481–3488.
Liu S, Guo LR. Data Ownership in the AI-Powered Integrative Health Care Landscape. JMIR Med Inform 2024, 12, e57754.
Welzel C, Ostermann M, Smith HL, et al. Enabling secure and self determined health data sharing and consent management. NPJ Digit Med 2025, 8, 560.
Balsa-Canto E, Campo-Manzanares N, Moimenta AR, et al. Quantifying and managing uncertainty in systems biology: Mechanistic and data-driven models. Curr Opin Syst Biol 2025, 42.
Zhou Z, Hu M, Salcedo M, et al. XAI meets Biology: A Comprehensive Review of Explainable AI in Bioinformatics Applications. arXiv 2023.
Singh R, Paxton M, Auclair J. Regulating the AI-enabled ecosystem for human therapeutics. Commun Med (Lond) 2025, 5, 181.
Niazi SK. Regulatory Perspectives for AI/ML Implementation in Pharmaceutical GMP Environments. Pharmaceuticals (Basel) 2025, 18.
de Lima RC, Quaresma JAS. Emerging technologies transforming the future of global biosecurity. Front Digit Health 2025, 7, 1622123.
Kumar D, Malin BA, Vishwanatha JK, et al. AI in Biomedicine-A Forward-Looking Perspective on Health Equity. Int J Environ Res Public Health 2024, 21.
Johnson KB, Horn IB, Horvitz E. Pursuing Equity With Artificial Intelligence in Health Care. JAMA Health Forum 2025, 6, e245031.
Krenn M, Pollice R, Guo SY, et al. On scientific understanding with artificial intelligence. Nat Rev Phys 2022, 4, 761–769.
Alvarado R. AI as an Epistemic Technology. Sci Eng Ethics 2023, 29, 32.

Figure 1. Foundations of Generative Immunoengineering.

Figure 2. The closed-loop DBTL pipeline for generative immunoengineering.

Figure 3. Governance Framework for Generative Immunoengineering: From Technical Foundations to Societal Oversight.

Table 1. From Conventional Immunotherapy to Generative Immunoengineering.

Dimension	Conventional Immunotherapy Paradigm	Generative Immunoengineering Paradigm
Design Logic	Empirical discovery based on trial-and-error antigen selection and screening.	Computational generation guided by probabilistic models of receptor–antigen interaction and cellular behavior.
Data Utilization	Limited to experimental assays and patient-level outcomes.	Integrates multi-omic, structural, and clinical data into unified embeddings for design and prediction.
Optimization Process	Manual iterative testing; low throughput; human-driven decision cycles.	Automated optimization via active learning, reinforcement signals, and adaptive design–build–test–learn (DBTL) loops.
Scope of Design	Focused on single-target molecules or cell products.	System-level generation of receptors, circuits, and cellular phenotypes across multiple design objectives.
Experimental Feedback	Linear workflow: hypothesis → test → validation.	Closed feedback loop: generative output → experimental validation → model retraining → improved design.
Time Scale	Months to years from concept to candidate validation.	Days to weeks with parallel computational synthesis and in-silico pre-screening.
Interpretability	Dependent on mechanistic intuition; limited transparency in design rationale.	Explainable AI and mechanistic priors enable causal insight alongside prediction.
Manufacturing Integration	Separated from design; manual scale-up.	Digitally linked to automated biofoundries with algorithmic quality control and digital-twin monitoring.
Regulatory Context	Static approval for fixed molecular entities.	Continuous validation and lifecycle oversight for adaptive, learning-based therapeutics.
Ethical Dimension	Reactive regulation; limited data transparency.	Embedded ethical governance: consent tracking, data provenance, and algorithmic accountability.

Table 2. Translational Matrix: Applications and Targets of Generative Immunoengineering.

Therapeutic Domain	Generative Design Strategy	Example Construct / Approach	Translational Stage	Clinical or Strategic Goal	References
Oncology	AI-driven receptor optimization for high-affinity, tumor-specific binding.	Multispecific or logic-gated CAR-T cells integrating affinity tuning and cytokine-controlled feedback.	Preclinical → Early-phase clinical trials.	Increase tumor selectivity, persistence, and safety; mitigate off-tumor toxicity.	[96,187]
Autoimmune Disorders	Generative modeling of regulatory-cell circuits to restore immune tolerance.	Synthetic regulatory T (Treg) or tolerogenic dendritic-cell designs with cytokine-balanced circuits.	Preclinical / proof-of-concept.	Suppress autoimmunity without broad immunosuppression.	[188,189]
Regenerative Medicine	AI-guided macrophage or APC reprogramming for tissue repair and graft tolerance.	Engineered macrophages producing pro-resolving mediators and metabolic-stability signatures.	Early preclinical studies.	Enhance regeneration, reduce fibrosis, and improve graft acceptance.	[190]
Infectious Disease & Vaccinology	Rapid generative design of receptor and antigen pairs using diffusion or language models.	Model-driven epitope design for next-generation vaccine candidates.	Preclinical / candidate identification.	Accelerate immune-response optimization to emerging pathogens.	[187,191]
Transplant Immunology	Generative modeling for donor–recipient immune matching and synthetic tolerance induction.	Predictive receptor generation minimizing alloreactivity; tolerance-inducing T-cell circuits.	Conceptual / exploratory.	Prevent graft rejection and chronic inflammation.	[192,193]
Immuno-Oncology Combinations	System-level optimization of multi-agent intervention.	Co-designed CAR-T + checkpoint-modulator constructs governed by reinforcement learning.	Translational modeling / phase I design.	Harmonize multi-modal immunotherapy dynamics.	[194,195]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
