While large language models (LLMs) have demonstrated significant advances in natural language processing, complex mathematical reasoning remains a challenging task, often exposing their limitations in multi-step calculation and logical consistency. Multi-agent systems have emerged as a promising paradigm for overcoming these limitations by distributing cognitive tasks among interacting agents, mirroring the dynamics of human problem solving. This paper provides a comparative review of the literature on nineteen multi-agent architectures for mathematical problem solving. Our central research question is: "How do different LLM-based multi-agent architectures enable or improve mathematical problem solving, and what are their comparative advantages, limitations, and design trade-offs?" Through a systematic analysis of agent roles, interaction mechanisms, and training methods, we identify several key findings. We trace an architectural evolution from unstructured debate-based systems toward more efficient hierarchical and self-optimizing frameworks. We also highlight persistent problems that hinder progress, including agent homogeneity, where agents built on the same underlying LLM fail to generate genuinely diverse reasoning, and the "lazy agent" problem, where some agents contribute minimally to the collaboration. This review provides a structured understanding of the current landscape and lays the groundwork for future research aimed at developing more reliable, efficient, and capable multi-agent reasoning systems.