Credit risk assessment requires both accurate prediction and a structured decomposition of how heterogeneous evidence contributes to each decision. Monolithic Large Language Models can incorporate unstructured evidence and natural-language reasoning into such workflows, but in high-stakes underwriting they may be distracted by noisy inputs, miss rare but decisive risk cues, and offer limited control over policy-dependent decision thresholds. We present CREDITAGENT, a hierarchical credit review system with three stages: evidence filtering, specialist risk analysis by agents, and decision fusion. Our central contribution is a controlled comparison: holding the adapted backbone, specialist-agent outputs, hard-stop rules, and data split fixed, we vary only the final fusion strategy to isolate the effect of hierarchical fusion on underwriting quality. On a held-out set of 6,000 personal credit cases from a Chinese financial institution, CREDITAGENT achieves 83.32% accuracy and a Business Efficiency Coefficient of 0.7647, outperforming flagship models. We present these findings as an institution-specific case study while identifying which components are mechanism-portable (hierarchical fusion, the GRPO training recipe) versus institution-specific (hard-stop rules, cost ratios). To ensure reproducibility, we make our code and dataset publicly available at
https://github.com/kouzhizhuo/Credit_Agents.