Submitted: 05 October 2025
Posted: 08 October 2025
Abstract
Keywords:
1. Introduction
2. Related Work
3. Methodology
4. Algorithm and Model
Meta-Cognitive Scheduler
Symbolic Parser Module
Semantic Grounding Module
Theorem Engine Module
Answer Synthesizer Module
Symbolic-Neural Computation Cell
Training Details
5. Hierarchical Training and Verification Enhancements
5.1. Hierarchical Curriculum Fine-Tuning Strategy
6. Multi-Level Self-Verification Mechanism
6.1. Token-Level Verification
6.2. Expression-Level Verification
6.3. Self-Consistency Inference Strategy
6.4. Total Enhanced Objective
6.5. Evaluation Metrics
6.5.1. Accuracy
6.5.2. Self-Consistency Rate (SCR)
6.5.3. Logical Consistency Score (LCS)
6.5.4. Formula Reconstruction Accuracy (FRA)
7. Experiment Results
7.1. Ablation Study Findings
8. Conclusion




| Model | Accuracy | Self-Consistency Rate (SCR) | Logical Consistency Score (LCS) | Formula Reconstruction Accuracy (FRA) | Time (s) |
|---|---|---|---|---|---|
| MetaMath-LLaMA (Full) | 0.735 | 0.85 | 0.92 | 0.88 | 5.6 |
| MetaMath-LLaMA (w/o MCS) | 0.713 | 0.81 | 0.89 | 0.85 | 5.2 |
| MetaMath-LLaMA (w/o SPM) | 0.725 | 0.83 | 0.90 | 0.86 | 5.4 |
| MetaMath-LLaMA (w/o TEM) | 0.720 | 0.82 | 0.88 | 0.84 | 5.3 |
| MetaMath-LLaMA (w/o ASM) | 0.700 | 0.79 | 0.87 | 0.83 | 5.1 |
| MetaMath-LLaMA (w/o SNCC) | 0.710 | 0.80 | 0.85 | 0.82 | 5.0 |
| Baseline GPT-3.5 | 0.612 | 0.73 | 0.80 | 0.75 | 6.2 |
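
The ablation rows above are easier to compare as differences from the full model. The following is a minimal sketch, not the authors' evaluation code: it copies the Accuracy column from the table and computes each variant's absolute and relative accuracy drop. The names `accuracy`, `FULL`, and `accuracy_drops` are hypothetical and chosen only for this illustration.

```python
# Accuracy values copied from the table above.
# All identifiers here are illustrative assumptions, not from the paper.
FULL = "MetaMath-LLaMA (Full)"

accuracy = {
    "MetaMath-LLaMA (Full)": 0.735,
    "MetaMath-LLaMA (w/o MCS)": 0.713,
    "MetaMath-LLaMA (w/o SPM)": 0.725,
    "MetaMath-LLaMA (w/o TEM)": 0.720,
    "MetaMath-LLaMA (w/o ASM)": 0.700,
    "MetaMath-LLaMA (w/o SNCC)": 0.710,
    "Baseline GPT-3.5": 0.612,
}


def accuracy_drops(scores, reference):
    """Return the absolute accuracy drop of every variant relative to `reference`."""
    ref = scores[reference]
    return {
        name: round(ref - value, 3)
        for name, value in scores.items()
        if name != reference
    }


drops = accuracy_drops(accuracy, FULL)

# Print variants from the largest drop to the smallest.
for name, drop in sorted(drops.items(), key=lambda kv: kv[1], reverse=True):
    relative = 100 * drop / accuracy[FULL]
    print(f"{name}: -{drop:.3f} absolute ({relative:.1f}% relative)")
```

On the table's numbers, removing ASM produces the largest accuracy drop among the ablated variants (0.735 to 0.700), and removing SPM the smallest (0.735 to 0.725). The same function can be applied to the SCR, LCS, or FRA columns by swapping in those values.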
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).