While Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs), it also introduces challenges that can degrade accuracy and performance. In practice, RAG can mask the intrinsic capabilities of LLMs. First, LLMs may become overly dependent on external retrieval and underuse their own knowledge and inference abilities, which can reduce responsiveness. Second, RAG may introduce irrelevant or low-quality information that acts as noise, disrupting the normal generation process and yielding inefficient, low-quality output, especially on complex problems. This paper proposes a RAG framework that uses reflective tags to control retrieval. The framework evaluates retrieved documents in parallel and incorporates the Chain of Thought (CoT) technique for step-by-step content generation; the model then selects the highest-quality and most accurate content for the final output. The main contributions are: 1) reducing hallucination by selectively using high-scoring documents; 2) improving real-time performance through timely retrieval from external databases; and 3) minimizing negative impacts by filtering out irrelevant or unreliable information through parallel content generation and reflective tagging. These advances aim to optimize the integration of retrieval mechanisms with LLMs and to produce high-quality, reliable outputs.
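
The control flow described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the tag strings, the keyword-based `reflect` rule, the word-overlap `relevance` score, the `min_score` threshold, and the templated `generate` function are all hypothetical stand-ins for components that would be LLM-driven in practice. The sketch only shows the order of operations: a reflective tag decides whether to retrieve, one candidate is generated per document in parallel, low-scoring candidates are filtered out, and the highest-scoring one is returned.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Hypothetical tag names; the paper's actual tag vocabulary is not given here.
RETRIEVE_TAG = "[Retrieve]"
NO_RETRIEVE_TAG = "[No Retrieve]"


@dataclass
class Candidate:
    document: str
    text: str
    score: float


def reflect(query: str) -> str:
    """Toy reflective tag: request retrieval only for time-sensitive queries."""
    lowered = query.lower()
    time_sensitive = any(word in lowered for word in ("latest", "current", "today"))
    return RETRIEVE_TAG if time_sensitive else NO_RETRIEVE_TAG


def relevance(query: str, document: str) -> float:
    """Toy score: fraction of query words that also appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


def generate(query: str, document: str) -> Candidate:
    """Stand-in for step-by-step (CoT) generation conditioned on one document."""
    text = f"Step 1: read '{document}'. Step 2: answer '{query}'."
    return Candidate(document, text, relevance(query, document))


def respond(query: str, corpus: list[str], min_score: float = 0.5) -> str:
    """Retrieve only when tagged, generate per document in parallel, keep the best."""
    fallback = f"Answer '{query}' from the model's own knowledge."
    if reflect(query) == NO_RETRIEVE_TAG:
        return fallback
    # One candidate per retrieved document, generated concurrently.
    with ThreadPoolExecutor() as pool:
        candidates = list(pool.map(lambda doc: generate(query, doc), corpus))
    # Filter out candidates built on irrelevant documents (assumed threshold).
    kept = [c for c in candidates if c.score >= min_score]
    if not kept:
        return fallback
    return max(kept, key=lambda c: c.score).text
```

Under these assumptions, `respond("latest release version", corpus)` would consult the corpus, while a query with no time-sensitive keyword would skip retrieval entirely and fall back to the model's own knowledge.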