Submitted: 01 July 2023
Posted: 05 July 2023
Abstract
Keywords:
1. Introduction
2. Research Questions
- RQ1: What applications can AI empower in mice behavior analysis studies? (Answered in Section 3)
- RQ2: How can the applications be taxonomized into AI tasks? (Answered in Section 3)
- RQ3: What AI methods can be used for executing the AI tasks? (Answered in Section 4)
- RQ4: How can MiceGPT train the AI methods, classify the AI tasks, and identify the applications? (Answered in Section 5)
3. Applications
- Including studies whose data are videos or video frames;
- Including studies that have exact application goals instead of technical goals;
- Excluding studies that focus on classical machine learning rather than deep learning.
3.1. Disease Detection
3.2. External Stimuli Effective Assessment
3.3. Social Behavior Analysis
3.4. Neurobehavioral Assessment
3.5. AI Tasks Taxonomy
4. AI-empowered Approaches
4.1. AI Pyramid
4.2. Backbone
4.3. Fundamental Layer Tasks
4.3.1. Image Classification
| Reference | Architecture | Type | Category | Dataset | Performance |
|---|---|---|---|---|---|
| [52] | AlexNet, C3D | Mice | Supervised learning | Private | The model not only provides more accurate annotations than alternative automatic methods, but also provides reliable annotations that can replace human annotations for neuroscience experiments. |
| [55] | ResNet18 | Mice | Supervised learning | Private | The CNN allows accurate and automated classification of freezing behavior throughout the duration of the experiments with minimal labor. |
| [65] | ResNet50 | Stomach | Semi-supervised learning | Private, Kvasir [69] | The classification accuracy is 92.57%, which is better than that of the other state-of-the-art semi-supervised methods and is also higher than the classification method based on transfer learning by 2.28%. |
| [66] | Transformer | Remote sensing | Self-supervised learning | Private | The generative self-supervised model achieves superior performance in terms of feature learning and land cover classification, especially in the small sample classification case. |
| [67] | ResNet18 | Retina | Self-supervised learning | Ichallenge-AMD dataset [70], Ichallenge-PM dataset [71] | The method outperforms other self-supervised feature learning methods (by around 4.2% in area under the curve) and can surpass the supervised baseline for pathologic myopia. |
| [68] | ResNet18 | Dental caries | Self-supervised learning | Private | Using as few as 18 annotations can produce 45% sensitivity, which is comparable to human-level diagnostic performance. |
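
Several rows in the table above report classification accuracy or sensitivity. As a reminder of how these two figures are derived from a model's predictions, the following is a minimal Python sketch. The function names and the toy labels are illustrative assumptions and are not taken from any of the cited studies.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred, positive=1):
    """Fraction of all samples that are classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(y_true, y_pred, positive=1):
    """Fraction of actual positives that are detected (also called recall)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn)


# Toy labels (hypothetical): 1 = behavior present, 0 = behavior absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
print(accuracy(y_true, y_pred), sensitivity(y_true, y_pred))
```

Accuracy and sensitivity can diverge sharply on imbalanced data, which is one reason the two are not interchangeable when comparing rows of the table.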
4.3.2. Object Detection
4.3.3. Semantic Segmentation
| Reference | Architecture | Type | Category | Dataset | Performance |
|---|---|---|---|---|---|
| [27] | YoloV3 | Mice | CNN-based | Open Images dataset | Achieves a performance of 97.2% in terms of accuracy |
| [78] | DCNN based on U-Net | Mice | CNN-based | MOST dataset | Improves the network performance by about 3–10% |
| [10] | - | Mice | - | Private | Achieves an overall accuracy of 0.92 ± 0.05 (mean ± SD) |
| [80] | Context Encoding Network based on ResNet | Semantic segmentation framework | CNN-based | CIFAR-10 dataset | Achieves an error rate of 3.45% |
| [81] | DCNN (VGG-16 or ResNet-101) | Semantic image segmentation model | CNN-based | PASCAL VOC 2012, PASCAL-Context, PASCAL-Person-Part, and Cityscapes dataset | Reaches 79.7% mIoU |
| [82] | DenseASPP, consists of a base network followed by a cascade of atrous convolution layers | Semantic image segmentation in autonomous driving | CNN-based | Cityscapes dataset | Achieves state-of-the-art performance |
| [83] | Transformer | Segmentation model | Transformer-based | ADE20K, Pascal Context, and Cityscapes dataset | Achieves new state of the art on ADE20K (50.28% mIoU), Pascal Context (55.83% mIoU) and competitive results on Cityscapes |
| [84] | Vision Transformer | Segmentation model | Transformer-based | ADE20K, Pascal Context, and Cityscapes dataset | Outperforms the state of the art on both ADE20K and Pascal Context datasets and is competitive on Cityscapes |
| [85] | Spatial-shift MLP (S2-MLP), containing only channel-mixing MLPs | Segmentation model | MLP-based | ImageNet-1K dataset | Attains considerably higher recognition accuracy than MLP-mixer on ImageNet-1K dataset. |
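
Many of the segmentation results above are reported as mean Intersection over Union (mIoU). The sketch below shows one common way to compute it from flattened per-pixel label maps. The function name, the choice to skip classes that appear in neither map, and the toy labels are assumptions made for illustration; individual benchmarks may define the average slightly differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes, given flat lists of per-pixel class ids.

    Classes absent from both prediction and target are skipped so that
    they do not contribute an undefined 0/0 term to the average.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes and six pixels (hypothetical values).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))  # per-class IoU: 1/3, 2/3, 1/2
```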
4.3.4. Instance Segmentation
| Reference | Architecture | Type | Category | Dataset | Performance |
|---|---|---|---|---|---|
| [29] | Mask R-CNN | Mice | Top-down method | Private | SIPEC successfully recognizes multiple behaviours of freely moving individual mice as well as socially interacting non-human primates in three dimensions |
| [86] | PDSL framework | - | Top-down method | PASCAL VOC 2012 [91], MS COCO [92] | PDSL framework outperforms baselines and achieves state-of-the-art results on PASCAL VOC and MS COCO. |
| [87] | Mask R-CNN | Cell | Top-down method | Private | The proposed architecture clearly outperforms a state-of-the-art Mask R-CNN approach for cell detection and segmentation with relative mean average precision improvements of up to 23.88% and 23.17%, respectively. |
| [88] | ResNet101 | Human | Bottom-up method | MHPv2 [93], DensePose-COCO [94], PASCAL-Person-Part [95] | Experiments on three instance-aware human parsing datasets show that the proposed model outperforms other bottom-up alternatives with much more efficient inference. |
| [89] | ResNet50 | - | Top-down method | LVIS [96] | The proposed framework achieves state-of-the-art results for instance segmentation in terms of both speed and accuracy, while being considerably simpler than the existing methods. |
| [90] | ResNet | - | Bottom-up method | COCO2017, Cityscapes [97] | The average segmentation accuracy on COCO2017 and Cityscapes reached 56% and 47.3% respectively, marking an increase of 4.4% and 7.4% over the performance of the original SOLO network. |
4.4. Middle Layer Tasks
4.4.1. Key Point Detection
4.4.2. Pose Estimation
| Reference | Architecture | Type | Category | Dataset | Performance |
|---|---|---|---|---|---|
| [102] | Hourglass network | Mice | 2D | Parkinson’s Disease Mouse Behaviour | Shows superior performance over other state-of-the-art methods in terms of PCK@0.2 score |
| [103] | ResNet, ASPP | Mice | 2D | Private | Achieves superior overall performance at various thresholds |
| [43] | Structured forests | Mice | 3D | Private | Precision 86% |
| [105] | HRNet | Human | 2D | COCO, MPII human pose estimation, and PoseTrack dataset | Achieves a 92.3 PCKh@0.5 score |
| [106] | HigherHRNet | Human | 2D | COCO dataset | Achieves new state-of-the-art result on COCO test-dev (70.5% AP), surpasses all top-down methods on CrowdPose test (67.6% AP) |
| [107] | Lite-HRNet | Human | 2D | COCO and MPII human pose estimation datasets | Achieves 87.0 PCKh@0.5 |
| [108] | ResNet-152 | Human | 3D | Human3.6M and CMU Panoptic datasets | Achieves state-of-the-art performance on the Human3.6M dataset |
| [109] | ResNet-50 | Human | 3D | InterHand and Human3.6M datasets | Outperforms the state of the art by 4.23 mm and achieves an MPJPE of 26.9 mm |
| [110] | ResNet-50 | Human | 3D | MPII, MuPoTs-3D, and RenderedH datasets | Outperforms the same whole-body model while staying close to the performance of the experts; it is less demanding than the ensemble of experts and can run in real time |
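
The pose estimation results above rely on two families of metrics: PCK (Percentage of Correct Keypoints, used for 2D results such as PCK@0.2 and PCKh@0.5) and MPJPE (Mean Per-Joint Position Error, used for 3D results in millimetres). The following Python sketch illustrates both under simple assumptions: a single reference length is passed in by the caller (which body segment or box size it represents differs between benchmarks), and all names and toy coordinates are hypothetical.

```python
import math  # math.dist requires Python 3.8 or later


def pck(pred, gt, ref_length, alpha=0.2):
    """Fraction of keypoints whose error is within alpha * ref_length.

    ref_length is the normalising length chosen by the benchmark,
    for example a head-segment length for PCKh.
    """
    threshold = alpha * ref_length
    correct = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return correct / len(gt)


def mpjpe(pred, gt):
    """Mean Euclidean distance between predicted and true joints."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)


# Toy 2D keypoints in pixels (hypothetical values).
gt_2d = [(0, 0), (10, 0), (0, 10), (10, 10)]
pred_2d = [(1, 0), (10, 3), (0, 10), (16, 18)]
print(pck(pred_2d, gt_2d, ref_length=10, alpha=0.2))

# Toy 3D joints in millimetres (hypothetical values).
gt_3d = [(0, 0, 0), (1, 2, 2)]
pred_3d = [(0, 0, 0), (0, 0, 0)]
print(mpjpe(pred_3d, gt_3d))
```

Because PCK depends on the chosen threshold and reference length, scores such as PCK@0.2 and PCKh@0.5 are not directly comparable with each other.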
4.5. Top Layer Tasks
4.5.1. Object Tracking
| Reference | Architecture | Type | Category | Dataset | Performance |
|---|---|---|---|---|---|
| [29] | Mask R-CNN | Mice | - | Private | SIPEC:SegNet robustly segments animals despite occlusions, multiple scales, and rapid movement, and enables tracking of animal identities within a session. |
| [32] | ResNet | Mice | - | Private | DeepLabCut can estimate the positions of mouse body parts. |
| [46] | ResNet | Mice | - | Private | Automated tracking of a dam and one pup was established in DeepLabCut and was combined with automated behavioral classification of “maternal approach”, “carrying” and “digging” in Simple Behavioral Analysis (SimBA). |
| [111] | Faster R-CNN, ResNet-50 | Human | Single-branch | MOT16 [116], MOT20 [117] | Compared with the two-stage methods on MOT16 and MOT20 datasets, the model achieves a new state-of-the-art performance even in crowded tracking scenes. |
| [113] | DNN | Vehicle | Multi-branch | KITTI [118] | The dual-branch classifier consistently outperforms previous single-branch approaches, improving on or directly competing with other state-of-the-art LiDAR-based methods. |
| [115] | ResNet50 | - | Multi-branch | VOT-2018 [119], VOT-2019 [120], OTB-100 [121], UAV123 [122], GOT10k [123], LaSOT [124] | MultiBSP achieves robust tracking with state-of-the-art performance; qualitative and quantitative analyses demonstrate the effectiveness of each module and the tracking stability. |
4.5.2. Action Recognition
| Reference | Architecture | Type | Category | Dataset | Performance |
|---|---|---|---|---|---|
| [38] | Hourglass network | Mice | video-based | Private | Provides a robust computational pipeline for the analysis of social behavior in pairs of interacting mice |
| [125] | 3D ConvNet, LSTM network | Mice | video-based | Private | Obtains accuracy on par with human assessment |
| [126] | LSTM | Mice | video-based | Private | Produces errors of 3.08%, 14.81%, and 7.4% on the training, validation, and testing sets, respectively |
| [127] | 2D CNN | Human | video-based | UCF101 and HMDB51 datasets | Outperforms other compared state-of-the-art models |
| [129] | 2D CNN | Human | video-based | UCF101 and HMDB51 datasets | Improves the recognition performance on LR video from 42.81% to 53.59% on the spatial stream and from 56.54% to 61.5% on the temporal stream |
| [131] | RNN | Human | video-based | UCF101 and HMDB51 datasets | Outperforms the state-of-the-art approaches for action recognition |
| [133] | 3D CNN | Human | video-based | Kinetics-600, Kinetics-400, mini-Kinetics, Something-Something V2, UCF101, and HMDB51 datasets | SGS decreases the computation cost (GFLOPS) between 33% and 53% without compromising accuracy. |
| [134] | RNN | Human | skeleton-based | Penn Treebank (PTB-c) and NTU RGB+D datasets | Performs much better than the traditional RNN, LSTM, and Transformer models on sequential MNIST classification, language modeling, and action recognition tasks |
| [135] | CNN | Human | skeleton-based | NTU RGB+D, the SYSU Human-Object Interaction, the UWA3D, the Northwestern-UCLA, and the SBU Kinect Interaction datasets | Superior performance over state-of-the-art approaches |
| [136] | GCN | Human | skeleton-based | NTU RGB+D 60 and 120 datasets | Outperforms other SOTA methods |
4.5.3. Action Prediction
5. MiceGPT Design
5.1. Fundamental Architecture Design
5.2. AI-empowered Query Layer
5.3. AI-empowered Application Layer
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Full Name |
|---|---|
| AI | Artificial Intelligence |
| CNN | Convolutional Neural Network |
| RNN | Recurrent Neural Network |
| CRNN | Convolutional Recurrent Neural Network |
| FC | Fully Connected |
| SVMs | Support Vector Machines |
| KNN | K-nearest Neighbours |
| MARS | Mouse Action Recognition System |
| RPN | Region Proposal Network |
| ROI | Region of Interest |
| IoU | Intersection over Union |
| GCN | Graph Convolutional Network |
References
- Manavalan; Basith; Shin; Lee; Wei; Lee. 4mCpred-EL: An Ensemble Learning Framework for Identification of DNA N4-methylcytosine Sites in the Mouse Genome. Cells 2019, 8, 1332.
- Koehler, C.C.; Hall, L.M.; Hellmer, C.B.; Ichinose, T. Using Looming Visual Stimuli to Evaluate Mouse Vision. Journal of Visualized Experiments 2019, 148, 59766.
- Taherzadeh, G.; Yang, Y.; Xu, H.; Xue, Y.; Liew, A.W.C.; Zhou, Y. Predicting Lysine-Malonylation Sites of Proteins Using Sequence and Predicted Structural Features. Journal of Computational Chemistry 2018, 39, 1757–1763.
- Pearson, B.L.; Defensor, E.B.; Blanchard, D.C.; Blanchard, R.J. C57BL/6J Mice Fail to Exhibit Preference for Social Novelty in the Three-Chamber Apparatus. Behavioural Brain Research 2010, 213, 189–194.
- Kulesskaya, N.; Voikar, V. Assessment of Mouse Anxiety-like Behavior in the Light–Dark Box and Open-Field Arena: Role of Equipment and Procedure. Physiology & Behavior 2014, 133, 30–38.
- Seo, M.K.; Jeong, S.; Seog, D.H.; Lee, J.A.; Lee, J.H.; Lee, Y.; McIntyre, R.S.; Park, S.W.; Lee, J.G. Effects of Liraglutide on Depressive Behavior in a Mouse Depression Model and Cognition in the Probe Trial of Morris Water Maze Test. Journal of Affective Disorders 2023, 324, 8–15.
- Bohnslav, J.P.; Wimalasena, N.K.; Clausing, K.J.; Dai, Y.Y.; Yarmolinsky, D.A.; Cruz, T.; Kashlan, A.D.; Chiappe, M.E.; Orefice, L.L.; Woolf, C.J.; Harvey, C.D. DeepEthogram, a Machine Learning Pipeline for Supervised Behavior Classification from Raw Pixels. eLife 2021, 10, e63377.
- Egnor, S.R.; Branson, K. Computational Analysis of Behavior. Annual Review of Neuroscience 2016, 39, 217–236.
- Alexandrov, V.; Brunner, D.; Menalled, L.B.; Kudwa, A.; Watson-Johnson, J.; Mazzella, M.; Russell, I.; Ruiz, M.C.; Torello, J.; Sabath, E.; Sanchez, A.; Gomez, M.; Filipov, I.; Cox, K.; Kwan, M.; Ghavami, A.; Ramboz, S.; Lager, B.; Wheeler, V.C.; Aaronson, J.; Rosinski, J.; Gusella, J.F.; MacDonald, M.E.; Howland, D.; Kwak, S. Large-Scale Phenome Analysis Defines a Behavioral Signature for Huntington’s Disease Genotype in Mice. Nature Biotechnology 2016, 34, 838–844.
- Geuther, B.; Chen, M.; Galante, R.J.; Han, O.; Lian, J.; George, J.; Pack, A.I.; Kumar, V. High-Throughput Visual Assessment of Sleep Stages in Mice Using Machine Learning. Sleep 2022, 45, zsab260.
- Taecharungroj, V. "What Can ChatGPT Do?" Analyzing Early Reactions to the Innovative AI Chatbot on Twitter. Big Data and Cognitive Computing 2023, 7, 35.
- Vogel-Ciernia, A.; Matheos, D.P.; Barrett, R.M.; Kramár, E.A.; Azzawi, S.; Chen, Y.; Magnan, C.N.; Zeller, M.; Sylvain, A.; Haettig, J.; et al. The Neuron-Specific Chromatin Regulatory Subunit BAF53b Is Necessary for Synaptic Plasticity and Memory. Nature Neuroscience 2013, 16, 552–561.
- Kalueff, A.V.; Stewart, A.M.; Song, C.; Berridge, K.C.; Graybiel, A.M.; Fentress, J.C. Neurobiology of Rodent Self-Grooming and Its Value for Translational Neuroscience. Nature Reviews Neuroscience 2016, 17, 45–59.
- Houle, D.; Govindaraju, D.R.; Omholt, S. Phenomics: The next Challenge. Nature Reviews Genetics 2010, 11, 855–866.
- Lee, K.; Park, I.; Bishayee, K.; Lee, U. Machine-Learning Based Automatic and Real-Time Detection of Mouse Scratching Behaviors. IBRO Reports 2019, 6, S414–S415.
- Sakamoto, N.; Haraguchi, T.; Kobayashi, K.; Miyazaki, Y.; Murata, T. Automated Scratching Detection System for Black Mouse Using Deep Learning. Frontiers in Physiology 2022, 13, 939281.
- Viglione, A.; Sagona, G.; Carrara, F.; Amato, G.; Totaro, V.; Lupori, L.; Putignano, E.; Pizzorusso, T.; Mazziotti, R. Behavioral Impulsivity Is Associated with Pupillary Alterations and Hyperactivity in CDKL5 Mutant Mice. Human Molecular Genetics 2022, 31, 4107–4120.
- Yu, H.; Xiong, J.; Ye, A.Y.; Cranfill, S.L.; Cannonier, T.; Gautam, M.; Zhang, M.; Bilal, R.; Park, J.E.; Xue, Y.; Polam, V.; Vujovic, Z.; Dai, D.; Ong, W.; Ip, J.; Hsieh, A.; Mimouni, N.; Lozada, A.; Sosale, M.; Ahn, A.; Ma, M.; Ding, L.; Arsuaga, J.; Luo, W. Scratch-AID, a Deep Learning-Based System for Automatic Detection of Mouse Scratching Behavior with High Accuracy. eLife 2022, 11, e84042.
- Weber, R.Z.; Mulders, G.; Kaiser, J.; Tackenberg, C.; Rust, R. Deep Learning-Based Behavioral Profiling of Rodent Stroke Recovery. BMC Biology 2022, 20, 232.
- Aljovic, A.; Zhao, S.; Chahin, M.; De La Rosa, C.; Van Steenbergen, V.; Kerschensteiner, M.; Bareyre, F.M. A Deep Learning-Based Toolbox for Automated Limb Motion Analysis (ALMA) in Murine Models of Neurological Disorders. Communications Biology 2022, 5, 131.
- Mathis, A.; Mamidanna, P.; Cury, K.M.; Abe, T.; Murthy, V.N.; Mathis, M.W.; Bethge, M. DeepLabCut: Markerless Pose Estimation of User-Defined Body Parts with Deep Learning. Nature Neuroscience 2018, 21, 1281–1289.
- Cai, H.; Luo, Y.; Yan, X.; Ding, P.; Huang, Y.; Fang, S.; Zhang, R.; Chen, Y.; Guo, Z.; Fang, J.; Wang, Q.; Xu, J. The Mechanisms of Bushen-Yizhi Formula as a Therapeutic Agent against Alzheimer’s Disease. Scientific Reports 2018, 8, 3104.
- Iino, Y.; Sawada, T.; Yamaguchi, K.; Tajiri, M.; Ishii, S.; Kasai, H.; Yagishita, S. Dopamine D2 Receptors in Discrimination Learning and Spine Enlargement. Nature 2020, 579, 555–560.
- Merlini, M.; Rafalski, V.A.; Rios Coronado, P.E.; Gill, T.M.; Ellisman, M.; Muthukumar, G.; Subramanian, K.S.; Ryu, J.K.; Syme, C.A.; Davalos, D.; Seeley, W.W.; Mucke, L.; Nelson, R.B.; Akassoglou, K. Fibrinogen Induces Microglia-Mediated Spine Elimination and Cognitive Impairment in an Alzheimer’s Disease Model. Neuron 2019, 101, 1099–1108.e6.
- Wotton, J.M.; Peterson, E.; Anderson, L.; Murray, S.A.; Braun, R.E.; Chesler, E.J.; White, J.K.; Kumar, V. Machine Learning-Based Automated Phenotyping of Inflammatory Nocifensive Behavior in Mice. Molecular Pain 2020, 16, 174480692095859.
- Kathote, G.; Ma, Q.; Angulo, G.; Chen, H.; Jakkamsetti, V.; Dobariya, A.; Good, L.B.; Posner, B.; Park, J.Y.; Pascual, J.M. Identification of Glucose Transport Modulators In Vitro and Method for Their Deep Learning Neural Network Behavioral Evaluation in Glucose Transporter 1–Deficient Mice. Journal of Pharmacology and Experimental Therapeutics 2023, 384, 393–405.
- Vidal, A.; Jha, S.; Hassler, S.; Price, T.; Busso, C. Face Detection and Grimace Scale Prediction of White Furred Mice. Machine Learning with Applications 2022, 8, 100312.
- Abdus-Saboor, I.; Fried, N.T.; Lay, M.; Burdge, J.; Swanson, K.; Fischer, R.; Jones, J.; Dong, P.; Cai, W.; Guo, X.; Tao, Y.X.; Bethea, J.; Ma, M.; Dong, X.; Ding, L.; Luo, W. Development of a Mouse Pain Scale Using Sub-second Behavioral Mapping and Statistical Modeling. Cell Reports 2019, 28, 1623–1634.e4.
- Marks, M.; Jin, Q.; Sturman, O.; Von Ziegler, L.; Kollmorgen, S.; Von Der Behrens, W.; Mante, V.; Bohacek, J.; Yanik, M.F. Deep-Learning-Based Identification, Tracking, Pose Estimation and Behaviour Classification of Interacting Primates and Mice in Complex Environments. Nature Machine Intelligence 2022, 4, 331–340.
- Torabi, R.; Jenkins, S.; Harker, A.; Whishaw, I.Q.; Gibb, R.; Luczak, A. A Neural Network Reveals Motoric Effects of Maternal Preconception Exposure to Nicotine on Rat Pup Behavior: A New Approach for Movement Disorders Diagnosis. Frontiers in Neuroscience 2021, 15, 686767.
- Martins, T.M.; Brown Driemeyer, J.P.; Schmidt, T.P.; Sobieranski, A.C.; Dutra, R.C.; Oliveira Weber, T. A Machine Learning Approach to Immobility Detection in Mice during the Tail Suspension Test for Depressive-Type Behavior Analysis. Research on Biomedical Engineering 2022, 39, 15–26.
- Wang, J.; Karbasi, P.; Wang, L.; Meeks, J.P. A Layered, Hybrid Machine Learning Analytic Workflow for Mouse Risk Assessment Behavior. eNeuro 2023, 10, ENEURO.0335-22.2022.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Las Vegas, NV, USA, 2016; pp. 770–778.
- Bermudez Contreras, E.; Sutherland, R.J.; Mohajerani, M.H.; Whishaw, I.Q. Challenges of a Small World Analysis for the Continuous Monitoring of Behavior in Mice. Neuroscience & Biobehavioral Reviews 2022, 136, 104621.
- Gharagozloo, M.; Amrani, A.; Wittingstall, K.; Hamilton-Wright, A.; Gris, D. Machine Learning in Modeling of Mouse Behavior. Frontiers in Neuroscience 2021, 15, 700253.
- Van Dam, E.A.; Noldus, L.P.; Van Gerven, M.A. Deep Learning Improves Automated Rodent Behavior Recognition within a Specific Experimental Setup. Journal of Neuroscience Methods 2020, 332, 108536.
- Robie, A.A.; Seagraves, K.M.; Egnor, S.E.R.; Branson, K. Machine Vision Methods for Analyzing Social Interactions. Journal of Experimental Biology 2017, 220, 25–34.
- Segalin, C.; Williams, J.; Karigo, T.; Hui, M.; Zelikowsky, M.; Sun, J.J.; Perona, P.; Anderson, D.J.; Kennedy, A. The Mouse Action Recognition System (MARS) Software Pipeline for Automated Analysis of Social Behaviors in Mice. eLife 2021, 10, e63720.
- Agbele, T.; Ojeme, B.; Jiang, R. Application of Local Binary Patterns and Cascade AdaBoost Classifier for Mice Behavioural Patterns Detection and Analysis. Procedia Computer Science 2019, 159, 1375–1386.
- Jiang, Z.; Zhou, F.; Zhao, A.; Li, X.; Li, L.; Tao, D.; Li, X.; Zhou, H. Multi-View Mouse Social Behaviour Recognition With Deep Graphic Model. IEEE Transactions on Image Processing 2021, 30, 5490–5504.
- Sheets, A.L.; Lai, P.L.; Fisher, L.C.; Basso, D.M. Quantitative Evaluation of 3D Mouse Behaviors and Motor Function in the Open-Field after Spinal Cord Injury Using Markerless Motion Tracking. PLoS ONE 2013, 8, e74536.
- Burgos-Artizzu, X.P.; Dollár, P.; Lin, D.; Anderson, D.J.; Perona, P. Social Behavior Recognition in Continuous Video. 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012; pp. 1322–1329.
- Salem, G.; Krynitsky, J.; Hayes, M.; Pohida, T.; Burgos-Artizzu, X. Three-Dimensional Pose Estimation for Laboratory Mouse From Monocular Images. IEEE Transactions on Image Processing 2019, 28, 4273–4287.
- Su, W.; Jiang, F.; Shi, C.; Wu, D.; Liu, L.; Li, S.; Yuan, Y.; Shi, J. An XGBoost-Based Knowledge Tracing Model. International Journal of Computational Intelligence Systems 2023, 16, 13.
- Huang, X.; Li, Z.; Jin, Y.; Zhang, W. Fair-AdaBoost: Extending AdaBoost Method to Achieve Fair Classification. Expert Systems With Applications 2022, 202, 117240.
- Winters, C.; Gorssen, W.; Ossorio-Salazar, V.A.; Nilsson, S.; Golden, S.; D’Hooge, R. Automated Procedure to Assess Pup Retrieval in Laboratory Mice. Scientific Reports 2022, 12, 1663.
- Hong, W.; Kennedy, A.; Burgos-Artizzu, X.P.; Zelikowsky, M.; Navonne, S.G.; Perona, P.; Anderson, D.J. Automated Measurement of Mouse Social Behaviors Using Depth Sensing, Video Tracking, and Machine Learning. Proceedings of the National Academy of Sciences 2015, 112.
- Tanas, J.K.; Kerr, D.D.; Wang, L.; Rai, A.; Wallaard, I.; Elgersma, Y.; Sidorov, M.S. Multidimensional Analysis of Behavior Predicts Genotype with High Accuracy in a Mouse Model of Angelman Syndrome. Translational Psychiatry 2022, 12, 426–434.
- Yamamoto, M.; Motomura, E.; Yanagisawa, R.; Hoang, V.A.T.; Mogi, M.; Mori, T.; Nakamura, M.; Takeya, M.; Eto, K. Evaluation of Neurobehavioral Impairment in Methylmercury-Treated KK-Ay Mice by Dynamic Weight-Bearing Test: Neurobehavioral Disorders in Methylmercury-Treated Mice. Journal of Applied Toxicology 2019, 39, 221–230.
- Delanogare, E.; Bullich, S.; Barbosa, L.A.D.S.; Barros, W.D.M.; Braga, S.P.; Kraus, S.I.; Kasprowicz, J.N.; Dos Santos, G.J.; Guiard, B.P.; Moreira, E.L.G. Metformin Improves Neurobehavioral Impairments of Streptozotocin-treated and Western Diet-fed Mice: Beyond Glucose-lowering Effects. Fundamental & Clinical Pharmacology 2023, 37, 94–106.
- McMackin, M.Z.; Henderson, C.K.; Cortopassi, G.A. Neurobehavioral Deficits in the KIKO Mouse Model of Friedreich’s Ataxia. Behavioural Brain Research 2017, 316, 183–188.
- Ren, Z.; Annie, A.N.; Ciernia, V.; Lee, Y.J. Who Moved My Cheese? Automatic Annotation of Rodent Behaviors with Convolutional Neural Networks. 2017 IEEE Winter Conference on Applications of Computer Vision (WACV); IEEE: Santa Rosa, CA, USA, 2017; pp. 1277–1286.
- Jiang, Z.; Crookes, D.; Green, B.D.; Zhao, Y.; Ma, H.; Li, L.; Zhang, S.; Tao, D.; Zhou, H. Context-Aware Mouse Behavior Recognition Using Hidden Markov Models. IEEE Transactions on Image Processing 2019, 28, 1133–1148.
- Tong, M.; Yu, X.; Shao, J.; Shao, Z.; Li, W.; Lin, W. Automated Measuring Method Based on Machine Learning for Optomotor Response in Mice. Neurocomputing 2020, 418, 241–250.
- Cai, L.X.; Pizano, K.; Gundersen, G.W.; Hayes, C.L.; Fleming, W.T.; Holt, S.; Cox, J.M.; Witten, I.B. Distinct Signals in Medial and Lateral VTA Dopamine Neurons Modulate Fear Extinction at Different Times. eLife 2020, 9, e54936.
- Jhuang, H.; Garrote, E.; Yu, X.; Khilnani, V.; Poggio, T.; Steele, A.D.; Serre, T. Corrigendum: Automated Home-Cage Behavioural Phenotyping of Mice. Nature Communications 2012, 3, 654.
- Lara-Doña, A.; Torres-Sanchez, S.; Priego-Torres, B.; Berrocoso, E.; Sanchez-Morillo, D. Automated Mouse Pupil Size Measurement System to Assess Locus Coeruleus Activity with a Deep Learning-Based Approach. Sensors 2021, 21, 7106.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection, 2016, arXiv:1506.02640 [cs].
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition, 2015, arXiv:1512.03385 [cs].
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017, arXiv:1704.04861 [cs].
- Newell, A.; Yang, K.; Deng, J. Stacked Hourglass Networks for Human Pose Estimation, 2016, arXiv:1603.06937 [cs].
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. Scaled-YOLOv4: Scaling Cross Stage Partial Network, 2021, arXiv:2011.08036 [cs].
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; Uszkoreit, J.; Houlsby, N. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2021, arXiv:2010.11929 [cs].
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows, 2021, arXiv:2103.14030 [cs].
- Du, W.; Rao, N.; Yong, J.; Wang, Y.; Hu, D.; Gan, T.; Zhu, L.; Zeng, B. Improving the Classification Performance of Esophageal Disease on Small Dataset by Semi-supervised Efficient Contrastive Learning. Journal of Medical Systems 2022, 46, 4.
- Xue, Z.; Yu, X.; Yu, A.; Liu, B.; Zhang, P.; Wu, S. Self-Supervised Feature Learning for Multimodal Remote Sensing Image Land Cover Classification. IEEE Transactions on Geoscience and Remote Sensing 2022, 60, 1–15.
- Li, X.; Hu, X.; Qi, X.; Yu, L.; Zhao, W.; Heng, P.A.; Xing, L. Rotation-Oriented Collaborative Self-Supervised Learning for Retinal Disease Diagnosis. IEEE Transactions on Medical Imaging 2021, 40, 2284–2294.
- Taleb, A.; Rohrer, C.; Bergner, B.; De Leon, G.; Rodrigues, J.A.; Schwendicke, F.; Lippert, C.; Krois, J. Self-Supervised Learning Methods for Label-Efficient Dental Caries Classification. Diagnostics 2022, 12, 1237.
- Pogorelov, K.; Randel, K.R.; Griwodz, C.; Lange, T.D.; Halvorsen, P. KVASIR: A Multi-Class Image Dataset for Computer Aided Gastrointestinal Disease Detection. ACM on Multimedia Systems Conference, 2017.
- Fu, H.; Li, F.; Orlando, J.; Bogunovic, H.; Sun, X.; Liao, J.; Xu, Y.; Zhang, S.; Zhang, X. ADAM: Automatic Detection Challenge on Age-Related Macular Degeneration. IEEE Dataport 2020.
- Fang, H.; Li, F.; Wu, J.; Fu, H.; Sun, X.; Orlando, J.I.; Bogunović, H.; Zhang, X.; Xu, Y. PALM: Open Fundus Photograph Dataset with Pathologic Myopia Recognition and Anatomical Structure Annotation, 2023, arXiv:2305.07816.
- Hu, Y.; Ding, Z.; Ge, R.; Shao, W.; Huang, L.; Li, K.; Liu, Q. AFDetV2: Rethinking the Necessity of the Second Stage for Object Detection from Point Clouds. Proceedings of the AAAI Conference on Artificial Intelligence 2022, 36, 969–979.
- Li, Z.; Tian, X.; Liu, X.; Liu, Y.; Shi, X. A Two-Stage Industrial Defect Detection Framework Based on Improved-YOLOv5 and Optimized-Inception-ResnetV2 Models. Applied Sciences 2022, 12, 834.
- Sun, G.; Hua, Y.; Hu, G.; Robertson, N. Efficient One-Stage Video Object Detection by Exploiting Temporal Consistency. In Computer Vision – ECCV 2022; Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T., Eds.; Springer Nature Switzerland: Cham, 2022; Volume 13695, pp. 1–16.
- Zhou, J.; Feng, K.; Li, W.; Han, J.; Pan, F. TS4Net: Two-stage Sample Selective Strategy for Rotating Object Detection. Neurocomputing 2022, 501, 753–764.
- Zhou, Q.; Li, X.; He, L.; Yang, Y.; Cheng, G.; Tong, Y.; Ma, L.; Tao, D. TransVOD: End-to-End Video Object Detection With Spatial-Temporal Transformers. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023, 45, 7853–7869.
- Yu, F.; Koltun, V. Multi-Scale Context Aggregation by Dilated Convolutions, 2016, arXiv:1511.07122 [cs].
- Wu, X.; Tao, Y.; He, G.; Liu, D.; Fan, M.; Yang, S.; Gong, H.; Xiao, R.; Chen, S.; Huang, J. Boosting Multilabel Semantic Segmentation for Somata and Vessels in Mouse Brain. Frontiers in Neuroscience 2021, 15, 610122. [Google Scholar] [CrossRef] [PubMed]
- Webb, J.M.; Fu, Y.H. Recent Advances in Sleep Genetics. Current Opinion in Neurobiology 2021, 69, 19–24. [Google Scholar] [CrossRef] [PubMed]
- Zhang, H.; Dana, K.; Shi, J.; Zhang, Z.; Wang, X.; Tyagi, A.; Agrawal, A. Context Encoding for Semantic Segmentation. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition; IEEE: Salt Lake City, UT, USA, 2018; pp. 7151–7160. [Google Scholar] [CrossRef]
- Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence 2018, 40, 834–848. [Google Scholar] [CrossRef]
- Yang, M.; Yu, K.; Zhang, C.; Li, Z.; Yang, K. DenseASPP for Semantic Segmentation in Street Scenes. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition; IEEE: Salt Lake City, UT, USA, 2018; pp. 3684–3692. [Google Scholar] [CrossRef]
- Zheng, S.; Lu, J.; Zhao, H.; Zhu, X.; Luo, Z.; Wang, Y.; Fu, Y.; Feng, J.; Xiang, T.; Torr, P.H.; Zhang, L. Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Nashville, TN, USA, 2021; pp. 6877–6886. [Google Scholar] [CrossRef]
- Strudel, R.; Garcia, R.; Laptev, I.; Schmid, C. Segmenter: Transformer for Semantic Segmentation. 2021 IEEE/CVF International Conference on Computer Vision (ICCV); IEEE: Montreal, QC, Canada, 2021; pp. 7242–7252. [Google Scholar] [CrossRef]
- Yu, T.; Li, X.; Cai, Y.; Sun, M.; Li, P. S 2 -MLP: Spatial-Shift MLP Architecture for Vision. 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV); IEEE: Waikoloa, HI, USA, 2022; pp. 3615–3624. [Google Scholar] [CrossRef]
- Shen, Y.; Cao, L.; Chen, Z.; Zhang, B.; Su, C.; Wu, Y.; Huang, F.; Ji, R. Parallel detection-and-segmentation learning for weakly supervised instance segmentation. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 8198–8208.
- Korfhage, N.; Mühling, M.; Ringshandl, S.; Becker, A.; Schmeck, B.; Freisleben, B. Detection and Segmentation of Morphologically Complex Eukaryotic Cells in Fluorescence Microscopy Images via Feature Pyramid Fusion. PLOS Computational Biology 2020, 16, e1008179.
- Zhou, T.; Wang, W.; Liu, S.; Yang, Y.; Van Gool, L. Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Nashville, TN, USA, 2021; pp. 1622–1631.
- Wang, X.; Zhang, R.; Shen, C.; Kong, T.; Li, L. SOLO: A Simple Framework for Instance Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021, pp. 1–1.
- Li, B.r.; Zhang, J.k.; Liang, Y. PaFPN-SOLO: A SOLO-based Image Instance Segmentation Algorithm. 2022 Asia Conference on Algorithms, Computing and Machine Learning (CACML). IEEE, 2022, pp. 557–564.
- Everingham, M.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The PASCAL Visual Object Classes (VOC) Challenge. International Journal of Computer Vision 2010, 88, 303–338.
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13. Springer, 2014, pp. 740–755.
- Zhao, J.; Li, J.; Cheng, Y.; Zhou, L.; Sim, T.; Yan, S.; Feng, J. Understanding Humans in Crowded Scenes: Deep Nested Adversarial Learning and A New Benchmark for Multi-Human Parsing. ACM 2018.
- Güler, R.A.; Neverova, N.; Kokkinos, I. DensePose: Dense Human Pose Estimation In The Wild. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
- Xia, F.; Wang, P.; Chen, X.; Yuille, A.L. Joint Multi-person Pose Estimation and Semantic Part Segmentation. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Honolulu, HI, 2017; pp. 6080–6089.
- Gupta, A.; Dollár, P.; Girshick, R. LVIS: A Dataset for Large Vocabulary Instance Segmentation. IEEE 2019.
- Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Schiele, B. The Cityscapes Dataset for Semantic Urban Scene Understanding. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
- Wen, J.; Chi, J.; Wu, C.; Yu, X. Human Pose Estimation Based Pre-training Model and Efficient High-Resolution Representation. 2021 40th Chinese Control Conference (CCC); IEEE: Shanghai, China, 2021; pp. 8463–8468.
- Gong, F.; Li, Y.; Yuan, X.; Liu, X.; Gao, Y. Human Elbow Flexion Behaviour Recognition Based on Posture Estimation in Complex Scenes. IET Image Processing 2023, 17, 178–192.
- Zang, Y.; Fan, C.; Zheng, Z.; Yang, D. Pose Estimation at Night in Infrared Images Using a Lightweight Multi-Stage Attention Network. Signal, Image and Video Processing 2021, 15, 1757–1765.
- Hong, F.; Lu, C.; Liu, C.; Liu, R.; Jiang, W.; Ju, W.; Wang, T. PGNet: Pipeline Guidance for Human Key-Point Detection. Entropy 2020, 22, 369.
- Zhou, F.; Jiang, Z.; Liu, Z.; Chen, F.; Chen, L.; Tong, L.; Yang, Z.; Wang, H.; Fei, M.; Li, L.; Zhou, H. Structured Context Enhancement Network for Mouse Pose Estimation. IEEE Transactions on Circuits and Systems for Video Technology 2022, 32, 2787–2801.
- Xu, Z.; Liu, R.; Wang, Z.; Wang, S.; Zhu, J. Detection of Key Points in Mice at Different Scales via Convolutional Neural Network. Symmetry 2022, 14, 1437.
- Topham, L.K.; Khan, W.; Al-Jumeily, D.; Hussain, A. Human Body Pose Estimation for Gait Identification: A Comprehensive Survey of Datasets and Models. ACM Computing Surveys 2023, 55, 1–42.
- Sun, K.; Xiao, B.; Liu, D.; Wang, J. Deep High-Resolution Representation Learning for Human Pose Estimation. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Long Beach, CA, USA, 2019; pp. 5686–5696.
- Cheng, B.; Xiao, B.; Wang, J.; Shi, H.; Huang, T.S.; Zhang, L. HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Seattle, WA, USA, 2020; pp. 5385–5394.
- Yu, C.; Xiao, B.; Gao, C.; Yuan, L.; Zhang, L.; Sang, N.; Wang, J. Lite-HRNet: A Lightweight High-Resolution Network. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Nashville, TN, USA, 2021; pp. 10435–10445.
- Iskakov, K.; Burkov, E.; Lempitsky, V.; Malkov, Y. Learnable Triangulation of Human Pose. 2019 IEEE/CVF International Conference on Computer Vision (ICCV); IEEE: Seoul, Korea (South), 2019; pp. 7717–7726.
- He, Y.; Yan, R.; Fragkiadaki, K.; Yu, S.I. Epipolar Transformers. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Seattle, WA, USA, 2020; pp. 7779–7788.
- Weinzaepfel, P.; Brégier, R.; Combaluzier, H.; Leroy, V.; Rogez, G. DOPE: Distillation of Part Experts for Whole-Body 3D Pose Estimation in the Wild. In Computer Vision – ECCV 2020; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M., Eds.; Springer International Publishing: Cham, 2020; Volume 12371, pp. 380–397.
- Wang, F.; Luo, L.; Zhu, E.; Wang, S. Multi-Object Tracking with a Hierarchical Single-Branch Network. In MultiMedia Modeling; Þór Jónsson, B., Gurrin, C., Tran, M.T., Dang-Nguyen, D.T., Hu, A.M.C., Huynh Thi Thanh, B., Huet, B., Eds.; Springer International Publishing: Cham, 2022; Volume 13142, pp. 73–83.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 2017, 39, 1137–1149.
- Vaquero, V.; Del Pino, I.; Moreno-Noguer, F.; Sola, J.; Sanfeliu, A.; Andrade-Cetto, J. Dual-Branch CNNs for Vehicle Detection and Tracking on LiDAR Data. IEEE Transactions on Intelligent Transportation Systems 2021, 22, 6942–6953.
- Geiger, A.; Lenz, P.; Urtasun, R. Are We Ready for Autonomous Driving? The KITTI Vision Benchmark Suite. 2012 IEEE Conference on Computer Vision and Pattern Recognition; IEEE: Providence, RI, 2012; pp. 3354–3361.
- Jiang, J.; Yang, X.; Li, Z.; Shen, K.; Jiang, F.; Ren, H.; Li, Y. MultiBSP: Multi-Branch and Multi-Scale Perception Object Tracking Framework Based on Siamese CNN. Neural Computing and Applications 2022, 34, 18787–18803.
- Milan, A.; Leal-Taixe, L.; Reid, I.; Roth, S.; Schindler, K. MOT16: A Benchmark for Multi-Object Tracking, 2016, arXiv:1603.00831.
- Dendorfer, P.; Rezatofighi, H.; Milan, A.; Shi, J.; Cremers, D.; Reid, I.; Roth, S.; Schindler, K.; Leal-Taixé, L. MOT20: A benchmark for multi object tracking in crowded scenes. arXiv 2020.
- Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision meets robotics: The KITTI dataset. International Journal of Robotics Research 2013, 32, 1231–1237.
- Kristan, M.; Leonardis, A.; Matas, J.; Felsberg, M.; Pflugfelder, R.; Čehovin Zajc, L.; Vojir, T.; Bhat, G.; Lukezic, A.; Eldesokey, A.; et al. The sixth visual object tracking VOT2018 challenge results. Proceedings of the European conference on computer vision (ECCV) workshops, 2018, pp. 0–0.
- Kristan, M.; Berg, A.; Zheng, L.; Rout, L.; Zhou, L. The Seventh Visual Object Tracking VOT2019 Challenge Results. 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019.
- Wu, Y.; Lim, J.; Yang, M.H. Object Tracking Benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence 2015, 37, 1834–1848.
- Mueller, M.; Smith, N.; Ghanem, B. A Benchmark and Simulator for UAV Tracking. Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14. Springer, 2016, pp. 445–461.
- Huang, L.; Zhao, X.; Huang, K. GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild. IEEE Transactions on Pattern Analysis and Machine Intelligence 2019, 43, 1562–1577.
- Fan, H.; Lin, L.; Yang, F.; Chu, P.; Deng, G.; Yu, S.; Bai, H.; Xu, Y.; Liao, C.; Ling, H. LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 5374–5383.
- Le, V.A.; Murari, K. Recurrent 3D Convolutional Network for Rodent Behavior Recognition. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); IEEE: Brighton, United Kingdom, 2019; pp. 1174–1178.
- Kramida, G.; Aloimonos, Y.; Parameshwara, C.M.; Fermuller, C.; Francis, N.A.; Kanold, P. Automated Mouse Behavior Recognition Using VGG Features and LSTM Networks. Visual Observation and Analysis of Vertebrate And Insect Behavior, 2016; pp. 1–3.
- Zong, M.; Wang, R.; Chen, X.; Chen, Z.; Gong, Y. Motion Saliency Based Multi-Stream Multiplier ResNets for Action Recognition. Image and Vision Computing 2021, 107, 104108.
- Feichtenhofer, C.; Pinz, A.; Wildes, R.P. Spatiotemporal Multiplier Networks for Video Action Recognition. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Honolulu, HI, 2017; pp. 7445–7454.
- Zhang, H.; Liu, D.; Xiong, Z. Two-Stream Action Recognition-Oriented Video Super-Resolution. 2019 IEEE/CVF International Conference on Computer Vision (ICCV); IEEE: Seoul, Korea (South), 2019; pp. 8798–8807.
- Majd, M.; Safabakhsh, R. Correlational Convolutional LSTM for Human Action Recognition. Neurocomputing 2020, 396, 224–229.
- He, J.Y.; Wu, X.; Cheng, Z.Q.; Yuan, Z.; Jiang, Y.G. DB-LSTM: Densely-connected Bi-directional LSTM for Human Action Recognition. Neurocomputing 2021, 444, 319–331.
- Tran, D.; Bourdev, L.; Fergus, R.; Torresani, L.; Paluri, M. Learning Spatiotemporal Features with 3D Convolutional Networks. 2015 IEEE International Conference on Computer Vision (ICCV); IEEE: Santiago, Chile, 2015; pp. 4489–4497.
- Fayyaz, M.; Bahrami, E.; Diba, A.; Noroozi, M.; Adeli, E.; Van Gool, L.; Gall, J. 3D CNNs with Adaptive Temporal Feature Resolutions. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Nashville, TN, USA, 2021; pp. 4729–4738.
- Li, S.; Li, W.; Cook, C.; Gao, Y. Deep Independently Recurrent Neural Network (IndRNN), 2020, arXiv:1910.06251.
- Zhang, P.; Lan, C.; Xing, J.; Zeng, W.; Xue, J.; Zheng, N. View Adaptive Neural Networks for High Performance Skeleton-Based Human Action Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 2019, 41, 1963–1978.
- Song, Y.F.; Zhang, Z.; Shan, C.; Wang, L. Constructing Stronger and Faster Baselines for Skeleton-Based Action Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023, 45, 1474–1488.
- Zang, T.; Zhu, Y.; Zhu, J.; Xu, Y.; Liu, H. MPAN: Multi-parallel Attention Network for Session-Based Recommendation. Neurocomputing 2022, 471, 230–241.
- Guo, S.; Lin, Y.; Wan, H.; Li, X.; Cong, G. Learning Dynamics and Heterogeneity of Spatial-Temporal Graph Data for Traffic Forecasting. IEEE Transactions on Knowledge and Data Engineering 2022, 34, 5415–5428.
- Zhang, H.; Ma, C.; Yu, D.; Guan, L.; Wang, D.; Hu, Z.; Liu, X. MTSCANet: Multi Temporal Resolution Temporal Semantic Context Aggregation Network. IET Computer Vision 2023, 17, 366–378.
- Wu, C.; Yin, S.; Qi, W.; Wang, X.; Tang, Z.; Duan, N. Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models, 2023, arXiv:2303.04671.
- Fernández, D.G.; Barrio, A.A.D.; Juan, G.B.; García, C.; Prieto, M.; Hermida, R. Complexity Reduction in the HEVC/H265 Standard Based on Smooth Region Classification. Digital Signal Processing 2018, 73, 24–39.
- Marathe, A.P. Towards Intelligent Database Systems Using Clusters of SQL Transactions. Knowledge and Information Systems 2023, 65, 2863–2894.
- Significant-Gravitas. AutoGPT, 2023.

| Application | Literature | AI Task | Data Attribute |
|---|---|---|---|
| Neurobehavioral Assessment | [54] | Semantic Segmentation, Key Point Detection | SV-T |
| | [52] | Image Classification | SV-T |
| | [55] | Image Classification | SV-T |
| | [10] | Semantic Segmentation, Action Recognition | SV-T |
| | [56] | Semantic Segmentation, Image Classification | SV-F |
| | [53] | Action Prediction | SV-F |
| | [57] | Semantic Segmentation | SV-F |
| Social Behavior Analysis | [39] | Object Detection, Action Recognition | SV-S |
| | [47] | Pose Estimation, Action Recognition | MV-TFS |
| | [46] | Pose Estimation, Action Recognition, Object Tracking | SV-T |
| | [40] | Action Recognition, Key Point Detection | MV-TS |
| | [48] | Action Recognition | MV-TS |
| | [38] | Object Detection, Pose Estimation, Action Recognition | MV-TF |
| External Stimuli Effective Assessment | [25] | Key Point Detection, Action Recognition | MV-B |
| | [26] | Pose Estimation | SV-B |
| | [27] | Object Detection, Semantic Segmentation, Image Classification | SV-F |
| | [28] | Action Recognition | SV-T |
| | [29] | Instance Segmentation, Key Point Detection, Object Tracking, Action Recognition | SV-T |
| | [30] | Action Recognition | SV-T |
| | [31] | Object Detection, Action Recognition | SV-F |
| | [32] | Object Tracking, Action Recognition | SV-T |
| Disease Detection | [18] | Semantic Segmentation, Action Recognition | SV-B |
| | [16] | Semantic Segmentation, Action Recognition | SV-T |
| | [19] | Key Point Detection, Pose Estimation | MV-BS |
| | [20] | Key Point Detection, Pose Estimation | SV-S |

| Reference | Architecture | Type | Category | Dataset | Performance |
|---|---|---|---|---|---|
| [27] | YOLOv3 | Mice | One-stage | Open Images dataset | A mean intersection over union (IoU) score of 0.87 |
| [31] | Inception ResNetV2 with Faster R-CNN | Mice | Two-stage | Private | Approximately 95% accuracy |
| [38] | Inception ResNetV2 with ImageNet pretrained weights | Mice | Two-stage | Behavior Ensemble and Neural Trajectory Observatory (BENTO) | Good efficiency on Precision-Recall (PR) curves |
| [72] | Point Cloud Voxelization, 3D Feature Extractor, backbone (AFDet) and the Anchor-Free Detector | Object detection from point clouds | One-stage, anchor-free | Waymo Open Dataset, nuScenes Dataset | Accuracy: 73.12; latency: 60.06 ms |
| [73] | YOLOv5, the feature fusion layer, and the multiscale detection layer | Industrial defect detection | Two-stage, anchor-based | VOC2007, NEU-DET, Enriched-NEU-DET | 83.3% mean average precision (mAP) |
| [74] | The location prior network (LPN) and the size prior network (SPN) | Video object detection | One-stage | ImageNet VID | 54.1 AP and 60.1 APl |
| [75] | ResNet backbone, a FPN, an ARM cascade network with rotated IoU prediction branch, and the two-stage sample selective strategy | Rotating object detection | Two-stage | UAV-ROD | 96.65 mAP and 98.84 accuracy under the plane category |

| Reference | Architecture | Type | Category | Dataset | Performance |
|---|---|---|---|---|---|
| [54] | CNN | Mice | 2D | Private | Achieve the recognition rate of 94.89% |
| [25] | ResNet-50 | Mice | 2D | Private | Reveal strain differences in both response timing and amplitude |
| [19] | ResNet-50 | Mice | 2D | Private | 98% accuracy when comparing baseline animals with animals at 3 dpi |
| [46] | ResNet-50 | Mice | 2D | Private | An accuracy of 86.7% |
| [20] | ResNet-50 | Mice | 2D | Private | Predict the acute injury status with 90% accuracy and long-term deficits with 85% accuracy. |
| [98] | SHNet, MaskedNet | Human | multi-person | MPII, COCO2017 | Achieve high accuracy on all 16 joint points |
| [99] | AlphaPose | Human | multi-person | Private, Halpe-FullBody136 | Detection precision is improved by 5.6%, and the false detection rate is reduced by 13% |
| [100] | LMANet | Human | single-person | Private, MPII, AI Challenger | PCKh value is 83.0935 |
| [101] | PGNet | Human | single-person | COCO | Improve the accuracy of the COCO dataset by 0.2% |
| Reference | Architecture | Type | Category | Dataset | Performance |
|---|---|---|---|---|---|
| [126] | RNN with LSTM | Mice | long-term | COCO, MPII | PCKh value is 92.3 in MPII and AP value is 75.5 in COCO |
| [53] | Hidden Markov model (HMM) | Mice | long-term | Private, JHuang’s datasets | Achieve weighted average accuracy of 96.5% (using visual and context features) and 97.9% (incorporated with IDT and TDD features) |
| [137] | Multi-Parallel Attention Network (MPAN) | Recommendation | short-term | YOOCHOOSE and DIGINETICA | Obtain the best ISLF |
| [138] | Spatial-Temporal Graph Neural Network (ASTGNN) | Traffic forecasting | long-term | Caltrans Performance Measurement System (PeMS) | Get the best performance in MAE, RMSE and MAPE. |
| [139] | Multi-temporal resolution pyramid structure model (MTSCANet) | Videos | temporal semantic context | THUMOS14, ActivityNet-1.3, HACS | An average mAP of 47.02% on THUMOS14, an average mAP of 34.94% on ActivityNet-1.3 and an average mAP of 28.46% on HACS |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).