Submitted: 10 May 2025
Posted: 12 May 2025
Abstract
Keywords:
1. Introduction to Caenorhabditis elegans as a Model Organism
2. Overview of Machine Learning
2.1. Types of Machine Learning
2.2. Types of Machine Learning Architecture
3. Machine Learning in C. elegans Developmental Research
3.1. Classification and Morphological Phenotyping of C. elegans
3.1.1. Classification of Developmental Stages
3.1.3. Physiological Age Estimation
3.1.4. Sexual Classification
3.1.5. Real-Time Tracking and Dynamic Phenotyping
3.2. Developmental Toxicity and Tissue Analysis in C. elegans
3.2.1. Developmental Toxicity Testing
3.2.2. Analysing Tissue Damage and Egg Viability
3.2.3. Tissue Morphological Transitions
3.3. Cellular Dynamics and Lineage Studies in C. elegans
3.3.1. Cell Lineage Tracing
3.3.2. Whole-Body Cell Segmentation and Recognition
3.3.3. Modelling Cellular Dynamics in Embryogenesis
3.3.4. Tracking Germline Stem Cell Dynamics in Embryos
3.3.5. Detection and Characterization of Multicellular Structures in Embryos
4. Future Perspectives and Limitations
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| CNN | Convolutional Neural Network |
| SVM | Support Vector Machine |
| GAN | Generative Adversarial Network |
| PCA | Principal Component Analysis |
| t-SNE | t-distributed Stochastic Neighbor Embedding |
| MAE | Mean Absolute Error |
| DIC | Differential Interference Contrast |
| IoU | Intersection over Union |
| GFP | Green Fluorescent Protein |
| RFP | Red Fluorescent Protein |
| GUI | Graphical User Interface |
| ABM | Agent-Based Modeling |
| DVF | Displacement Vector Field |
| GSC | Germline Stem Cell |
| CTWF | Corrected Total Worm Fluorescence |
| RGB | Red Green Blue (color space) |
| FNN | Feedforward Neural Network |
| RNN | Recurrent Neural Network |
| AUC | Area Under the Curve |
| KNIME | Konstanz Information Miner |
| RCNN | Region-based Convolutional Neural Network |
| Mask-RCNN | Mask Region-based Convolutional Neural Network |
| SVM-DA | Support Vector Machine - Discriminant Analysis |
| DMapNet | Distance Map-based Network |
| ResNet | Residual Network |
| 3D | Three-Dimensional |
| 2D | Two-Dimensional |
| 1GB | 1 Gigabyte |
| L-stage | Larval Stage |
References
1. K. Ray et al., "A bioinformatics approach to elucidate conserved genes and pathways in C. elegans as an animal model for cardiovascular research," Sci Rep, vol. 14, no. 1, p. 7471.
2. Y. Azuma, H. Okada, and S. Onami, "Systematic analysis of cell morphodynamics in C. elegans early embryogenesis," Front Bioinform, vol. 3, p. 1082531, 2023.
3. J. S. Packer et al., "A lineage-resolved molecular atlas of C. elegans embryogenesis at single-cell resolution," Science, vol. 365, no. 6459, Sep 20 2019.
4. S. So, M. Asakawa, and H. Sawa, "Distinct functions of three Wnt proteins control mirror-symmetric organogenesis in the C. elegans gonad," Elife, vol. 13, Nov 1 2024.
5. R. Godini, H. Fallahi, and R. Pocock, "The regulatory landscape of neurite development in Caenorhabditis elegans," Front Mol Neurosci, vol. 15, p. 974208, 2022.
6. S. Zhang, F. Li, T. Zhou, G. Wang, and Z. Li, "Caenorhabditis elegans as a Useful Model for Studying Aging Mutations," Front Endocrinol (Lausanne), vol. 11, p. 554994, 2020.
7. Y. Li et al., "A full-body transcription factor expression atlas with completely resolved cell identities in C. elegans," Nat Commun, vol. 15, no. 1, p. 358, Jan 9 2024.
8. K. Corsi, B. Wightman, and M. Chalfie, "A Transparent Window into Biology: A Primer on Caenorhabditis elegans," Genetics, vol. 200, no. 2, pp. 387-407, Jun 2015.
9. L. P. O'Reilly, C. J. Luke, D. H. Perlmutter, G. A. Silverman, and S. C. Pak, "C. elegans in high-throughput drug discovery," Adv Drug Deliv Rev, vol. 69-70, pp. 247-53, Apr 2014.
10. H. Yuan et al., "Microfluidic-Assisted Caenorhabditis elegans Sorting: Current Status and Future Prospects," Cyborg Bionic Syst, vol. 4, p. 0011, 2023.
11. Q. An, S. Rahman, J. Zhou, and J. J. Kang, "A Comprehensive Review on Machine Learning in Healthcare Industry: Classification, Restrictions, Opportunities and Challenges," Sensors (Basel), vol. 23, no. 9, Apr 22 2023.
12. N. Buton, F. Coste, and Y. Le Cunff, "Predicting enzymatic function of protein sequences with attention," Bioinformatics, vol. 39, no. 10, Oct 3 2023.
13. E. T. Russo, F. Barone, A. Bateman, S. Cozzini, M. Punta, and A. Laio, "DPCfam: Unsupervised protein family classification by Density Peak Clustering of large sequence datasets," PLoS Comput Biol, vol. 18, no. 10, p. e1010610, Oct 2022.
14. R. Mourad, "Semi-supervised learning improves regulatory sequence prediction with unlabeled sequences," BMC Bioinformatics, vol. 24, no. 1.
15. R. Yang, L. Zhang, F. Bu, F. Sun, and B. Cheng, "AI-based prediction of protein-ligand binding affinity and discovery of potential natural product inhibitors against ERK2," BMC Chem, vol. 18, no. 1, p. 108, Jun 3 2024.
16. L. Zhang et al., "A deep learning model to identify gene expression level using cobinding transcription factor signals," Brief Bioinform, vol. 23, no. 1, Jan 17 2022.
17. M. Wang, Z. Wei, M. Jia, L. Chen, and H. Ji, "Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records," BMC Med Inform Decis Mak, vol. 22, no. 1, p. 41, Feb 16 2022.
18. M. Y. Anwar et al., "Machine learning-based clustering identifies obesity subgroups with differential multi-omics profiles and metabolic patterns," Obesity (Silver Spring), vol. 32, no. 11, pp. 2024-2034, Nov 2024.
19. J. L. Ballard, Z. Wang, W. Li, L. Shen, and Q. Long, "Deep learning-based approaches for multi-omics data integration and analysis," BioData Min, vol. 17, no. 1, p. 38, Oct 2 2024.
20. G. Adam, L. Rampasek, Z. Safikhani, P. Smirnov, B. Haibe-Kains, and A. Goldenberg, "Machine learning approaches to drug response prediction: challenges and recent progress," NPJ Precis Oncol, vol. 4, p. 19, 2020.
21. Z. Guan, G. Parmigiani, D. Braun, and L. Trippa, "Prediction of Hereditary Cancers Using Neural Networks," Ann Appl Stat, vol. 16, no. 1, 2022.
22. B. Poirion, Z. Jing, K. Chaudhary, S. Huang, and L. X. Garmire, "DeepProg: an ensemble of deep-learning and machine-learning models for prognosis prediction using multi-omics data," Genome Med, vol. 13, no. 1, p. 112, Jul 14 2021.
23. F. Firat Atay et al., "A hybrid machine learning model combining association rule mining and classification algorithms to predict differentiated thyroid cancer recurrence," Front Med (Lausanne), vol. 11, p. 1461372, 2024.
24. R. Y. Choi, A. S. Coyner, J. Kalpathy-Cramer, M. F. Chiang, and J. P. Campbell, "Introduction to Machine Learning, Neural Networks, and Deep Learning," Transl Vis Sci Technol, vol. 9, no. 2, p. 14, Feb 27 2020.
25. U. Ravindran and C. Gunavathi, "Deep learning assisted cancer disease prediction from gene expression data using WT-GAN," BMC Med Inform Decis Mak, vol. 24, no. 1, p. 311, Oct 24 2024.
26. K. Shimasaki, Y. Okemoto-Nakamura, K. Saito, M. Fukasawa, K. Katoh, and K. Hanada, "Deep learning-based segmentation of subcellular organelles in high-resolution phase-contrast images," Cell Struct Funct, vol. 49, no. 2, pp. 57-65, Aug 30 2024.
27. S. M. Kandathil, A. M. Lau, and D. T. Jones, "Machine learning methods for predicting protein structure from single sequences," Curr Opin Struct Biol, vol. 81, p. 102627, Aug 2023.
28. R. Cao, C. Freitas, L. Chan, M. Sun, H. Jiang, and Z. Chen, "ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network," Molecules, vol. 22, no. 10, Oct 17 2017.
29. G. White et al., "Rapid and accurate developmental stage recognition of C. elegans from high-throughput image data," Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit, vol. 2010, no. 13-18 June 2010, pp. 3089-3096, Aug 5 2010.
30. P. Pan et al., "High-Resolution Imaging and Morphological Phenotyping of C. elegans through Stable Robotic Sample Rotation and Artificial Intelligence-Based 3-Dimensional Reconstruction," Research (Wash D C), vol. 7, p. 0513, 2024.
31. J. L. Lin, W. L. Kuo, Y. H. Huang, T. L. Jong, A. L. Hsu, and W. H. Hsu, "Using Convolutional Neural Networks to Measure the Physiological Age of Caenorhabditis elegans," IEEE/ACM Trans Comput Biol Bioinform, vol. 18, no. 6, pp. 2724-2732, Nov-Dec 2021.
32. T. Moore, J. M. Jordan, and L. R. Baugh, "WormSizer: high-throughput analysis of nematode size and shape," PLoS One, vol. 8, no. 2, p. e57142, 2013.
33. J. Schindelin et al., "Fiji: an open-source platform for biological-image analysis," Nat Methods, vol. 9, no. 7, pp. 676-82, Jun 28 2012.
34. S. K. Jung, B. Aleman-Meza, C. Riepe, and W. Zhong, "QuantWorm: a comprehensive software package for Caenorhabditis elegans phenotypic assays," PLoS One, vol. 9, no. 1, p. e84830, 2014.
35. Wahlby et al., "An image analysis toolbox for high-throughput C. elegans assays," Nat Methods, vol. 9, no. 7, pp. 714-6, Apr 22 2012.
36. Hakim et al., "WorMachine: machine learning-based phenotypic analysis tool for worms," BMC Biol, vol. 16, no. 1, p. 8, Jan 16 2018.
37. Z. Li et al., "A robotic system for automated genetic manipulation and analysis of Caenorhabditis elegans," PNAS Nexus, vol. 2, no. 7, p. pgad197, Jul 2023.
38. H. Baris Atakan, T. Alkanat, M. Cornaglia, R. Trouillon, and M. A. M. Gijs, "Automated phenotyping of Caenorhabditis elegans embryos with a high-throughput-screening microfluidic platform," Microsyst Nanoeng, vol. 6, p. 24, 2020.
39. W. A. Boyd, M. V. Smith, G. E. Kissling, and J. H. Freedman, "Medium- and high-throughput screening of neurotoxicants using C. elegans," Neurotoxicol Teratol, vol. 32, no. 1, pp. 68-73, Jan-Feb 2010.
40. P. R. Hunt, "The C. elegans model in toxicity testing," J Appl Toxicol, vol. 37, no. 1, pp. 50-59, Jan 2017.
41. S. Yoon et al., "Microfluidics in High-Throughput Drug Screening: Organ-on-a-Chip and C. elegans-Based Innovations," Biosensors (Basel), vol. 14, no. 1, Jan 21 2024.
42. DuPlissis et al., "Machine learning-based analysis of microfluidic device immobilized C. elegans for automated developmental toxicity testing," Sci Rep, vol. 15, no. 1, p. 15, Jan 2 2025.
43. L. Nigamatzyanova and R. Fakhrullin, "Dark-field hyperspectral microscopy for label-free microplastics and nanoplastics detection and identification in vivo: A Caenorhabditis elegans study," Environ Pollut, vol. 271, p. 116337, Feb 15 2021.
44. S. Verdu, C. Fuentes, J. M. Barat, and R. Grau, "Characterisation of chemical damage on tissue structures by multispectral imaging and machine learning procedures: Alkaline hypochlorite effect in C. elegans," Comput Biol Med, vol. 145, p. 105477, Jun 2022.
45. J. Dybiec, M. Szlagor, E. Mlynarska, J. Rysz, and B. Franczyk, "Structural and Functional Changes in Aging Kidneys," Int J Mol Sci, vol. 23, no. 23, Dec 6 2022.
46. J. Johnston, W. B. Iser, D. K. Chow, I. G. Goldberg, and C. A. Wolkow, "Quantitative image analysis reveals distinct structural transitions during aging in Caenorhabditis elegans tissues," PLoS One, vol. 3, no. 7, p. e2821, Jul 30 2008.
47. Z. Bao, J. I. Murray, T. Boyle, S. L. Ooi, M. J. Sandel, and R. H. Waterston, "Automated cell lineage tracing in Caenorhabditis elegans," Proc Natl Acad Sci U S A, vol. 103, no. 8, pp. 2707-12, Feb 21 2006.
48. Z. Aydin, J. I. Murray, R. H. Waterston, and W. S. Noble, "Using machine learning to speed up manual image annotation: application to a 3D imaging protocol for measuring single cell gene expression in the developing C. elegans embryo," BMC Bioinformatics, vol. 11, p. 84, Feb 11 2010.
49. J. Cao et al., "Establishment of a morphological atlas of the Caenorhabditis elegans embryo using deep-learning-based 4D segmentation," Nat Commun, vol. 11, no. 1, p. 6254, Dec 7 2020.
50. Y. Li et al., "Automated segmentation and recognition of C. elegans whole-body cells," Bioinformatics, vol. 40, no. 5, May 2 2024.
51. Y. Setty, "Multi-scale computational modeling of developmental biology," Bioinformatics, vol. 28, no. 15, pp. 2022-8, Aug 1 2012.
52. Z. Wang et al., "An Observation-Driven Agent-Based Modeling and Analysis Framework for C. elegans Embryogenesis," PLoS One, vol. 11, no. 11, p. e016.
53. Wang, Z. Wang, X. Zhao, Y. Xu, and Z. Bao, "An Observation Data Driven Simulation and Analysis Framework for Early Stage C. elegans Embryogenesis," J Biomed Sci Eng, vol. 11, no. 8, pp. 225-234, Aug 2018.
54. Z. Wang, D. Wang, C. Li, Y. Xu, H. Li, and Z. Bao, "Deep reinforcement learning of cell movement in the early stage of C.elegans embryogenesis," Bioinformatics, vol. 34, no. 18, pp. 3169-3177, Sep 15 2018.
55. R. M. Zellag, Y. Zhao, V. Poupart, R. Singh, J. C. Labbe, and A. R. Gerhold, "CentTracker: a trainable, machine-learning-based tool for large-scale analyses of Caenorhabditis elegans germline stem cell mitosis," Mol Biol Cell, vol. 32, no. 9, pp. 915-930, Apr 19 2021.
56. D. Wang, Z. Lu, Y. Xu, Z. I. Wang, A. Santella, and Z. Bao, "Cellular structure image classification with small targeted training samples," IEEE Access, vol. 7, pp. 148967-148974, 2019.
57. M. Kore, D. Acharya, L. Sharma, S. S. Vembar, and S. Sundriyal, "Development and experimental validation of a machine learning model for the prediction of new antimalarials," BMC Chem, vol. 19, no. 1, p. 28, Jan 30 2025.
58. P. Godec et al., "Democratized image analytics by visual programming through integration of deep models and small-scale machine learning," Nat Commun, vol. 10, no. 1, p. 4551, Oct 7 2019.




| Sl. No | Phenotype | Input data | Machine learning model | Pros | Cons | Reference |
|---|---|---|---|---|---|---|
| 1 | Developmental stage classification (eggs, larvae, adult) | High-resolution image datasets (brightfield microscopy) | SVM | High precision for adults, reduces human errors | Low precision for eggs and larvae | [29] |
| 2 | 3D worm body structure, key morphological traits | Stacked 2D confocal or widefield microscopy images | Customized machine learning pipeline with noise reduction and segmentation | Accurate 3D reconstructions, applicable to drug screens | Limited real-time dynamic phenotyping | [30] |
| 3 | Physiological age estimation | Brightfield images of worms across 14-day lifespan | CNN (InceptionResNetV2) | Granular day-level age prediction | Potential bias due to manual preprocessing | [31] |
| 4 | Sex determination (male, hermaphrodite) | High-contrast fluorescence and morphological images | SVM with PCA and t-SNE for dimensionality reduction | High sexual classification accuracy | Memory constraints for large image files | [36] |
| 5 | Dynamic phenotypic changes during development | Brightfield and fluorescence microscopy images | CNN and Mask-RCNN | Reduces manual interventions | Limited for worms with extreme morphologies | [37] |
| 6 | Embryonic developmental stages, motility, and viability states | Brightfield and fluorescent image patches of embryos | AlexNet-based CNN with standard image processing | Rapid phenotyping of embryos, suitable for large-scale screenings, reduces manual interventions | Requires high-performance GPUs and is sensitive to labelled data quality and quantity | [38] |
| 7 | Morphological and developmental changes due to toxins | High-resolution brightfield and fluorescence images | 2.5D U-Net for segmentation | Low variability, high reproducibility | Requires high-performance GPUs and memory | [42] |
| 8 | Tissue damage, egg viability under stress conditions | Multispectral images (450-950 nm) of worms and eggs | PCA, SVM-DA (Discriminant Analysis) | Non-invasive imaging with high specificity | Sophisticated imaging systems needed | [44] |
| 9 | Pharynx structure changes across lifespan | DIC microscopy images of pharynx tissue | Pattern recognition-based machine learning algorithm | Quantitative insights into structural aging | Limited to pharynx tissue | [46] |
| 10 | Cell lineage development, nuclear divisions | 3D confocal microscopy images of embryos | SVM classifier integrated with StarryNite software | Reduces errors and manual curation time | Does not address false negatives | [48] |
| 11 | Cell shape, volume, surface area, nucleus position, and spatial organization | 3D time-lapse confocal microscopy images of embryos (4-cell to 350-cell stages) | DMapNet deep learning model (distance map-based segmentation) | Generates comprehensive 3D morphological atlas, high accuracy in densely packed cellular environments | Requires significant computational resources and lacks a user-friendly visualization platform | [49] |
| 12 | Whole-body cell identification and segmentation | 3D fluorescence microscopy images | DVF-based deep learning model | Adaptable to other animal models | Requires extensive statistical priors | [50] |
| 13 | Cell migration, division, fate determination | Time-lapse 3D confocal microscopy images | ABM combined with reinforcement learning | Provides cellular behavior insights | High computational requirements | [53,54] |
| 14 | Germline stem cell division dynamics | Live imaging of germline stem cells | Random forest-based track pair classifier | Spatial clustering analysis of GSCs | Performance drops in noisy datasets | [55] |
| 15 | Detection of multicellular rosette structures | 3D live images with fluorescently labeled cell membranes | GAN-based deep learning model with feature transfer | Efficient classification with small datasets | Performance depends on high-performance GPUs | [56] |
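
The Abbreviations list two evaluation metrics, MAE and IoU, that are commonly used to score regression models (such as age estimation) and segmentation models (such as those in the table above). The following is a minimal sketch of how each is computed, written in plain Python. The function names and the toy data are hypothetical and are not taken from any of the cited studies, and which studies report which metric is not stated here.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """MAE: the average of |true - predicted| over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def intersection_over_union(mask_a: Sequence[Sequence[int]],
                            mask_b: Sequence[Sequence[int]]) -> float:
    """IoU: |A intersect B| / |A union B| for two binary masks of equal shape."""
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                inter += 1
            if a or b:
                union += 1
    # Assumption: two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


# Hypothetical toy data: worm ages in days (true vs. predicted).
true_age_days = [2, 5, 9, 12]
pred_age_days = [3, 5, 8, 14]
mae = mean_absolute_error(true_age_days, pred_age_days)

# Hypothetical toy data: 3x4 binary segmentation masks (1 = worm pixel).
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 0, 1, 0],
        [0, 0, 0, 0]]
iou = intersection_over_union(truth, pred)

print(f"MAE = {mae:.2f} days, IoU = {iou:.2f}")
```

In the toy data the absolute age errors are 1, 0, 1 and 2 days, and the two masks share 3 of the 5 pixels marked in either mask. A lower MAE and a higher IoU indicate better agreement with the ground truth.
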

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
