Precise quantification of fine motor behavior is essential for understanding neural circuit function and evaluating therapeutic interventions in neurological disorders. While markerless pose estimation frameworks such as DeepLabCut (DLC) have transformed behavioral phenotyping, the choice of convolutional neural network (CNN) backbone significantly impacts tracking performance, particularly for tasks involving small distal joints and partial occlusions. In this paper, we present the first systematic comparison of nine CNN architectures implemented in DLC for lateral-view analysis of fine reaching movements in the Montoya Staircase test, a gold-standard assay for skilled forelimb coordination in rodent models of stroke and neurodegenerative disease. Using a dataset of videos representing both control and M1-lesioned conditions, we rigorously evaluated models across six critical dimensions: spatial accuracy (RMSE, PCK@5px), mean average precision (mAP), occlusion robustness, inference speed, and GPU memory usage. Our results reveal that multi-scale DLCRNet architectures substantially outperformed classical backbones, with DLCRNet_ms5 achieving the highest overall accuracy and DLCRNet_stride16_ms5 providing the best trade-off between precision and computational efficiency. These findings provide critical methodological guidance for neuroscience laboratories and highlight the importance of architecture selection for rigorous quantification of fine motor behavior in preclinical research.