In recent years, neural networks based on attention mechanisms have been increasingly applied to speech recognition, separation, enhancement, and related fields. In particular, the convolution-augmented transformer has achieved strong performance by combining the advantages of convolution and self-attention. Recently, the gated attention unit (GAU) has been proposed; compared with traditional multi-head self-attention, GAU-based approaches are both effective and computationally efficient. In this article, we propose CGA-MGAN, a metric GAN for speech enhancement based on convolution-augmented gated attention. CGA-MGAN captures local and global correlations in speech signals simultaneously by fusing convolution and gated attention units. Experiments on Voice Bank + DEMAND show that the proposed CGA-MGAN achieves excellent performance (3.47 PESQ, 0.96 STOI, and 11.09 dB SSNR) with a relatively small model size (1.14 M parameters).
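To make the idea of fusing convolution with gated attention concrete, the sketch below shows a minimal convolution-augmented GAU block in PyTorch. It follows the general GAU formulation (gated value path, single shared projection for queries/keys, squared-ReLU attention scores) and adds a depthwise convolution branch for local context; the class name `ConvAugmentedGAU`, the expansion factor, kernel size, and the choice to fuse local and global context on the value path are illustrative assumptions, not the authors' exact CGA-MGAN implementation.

```python
# Minimal sketch (assumptions noted above), NOT the authors' exact model:
# a single convolution-augmented gated attention unit (GAU) block that mixes
# a global gated-attention branch with a local depthwise-convolution branch.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvAugmentedGAU(nn.Module):
    def __init__(self, dim: int, expansion: int = 2, query_key_dim: int = 64,
                 conv_kernel: int = 31):
        super().__init__()
        hidden = dim * expansion
        self.norm = nn.LayerNorm(dim)
        # Gate (u) and value (v) branches of the GAU.
        self.to_uv = nn.Linear(dim, hidden * 2)
        # Shared low-dimensional projection from which queries/keys are derived.
        self.to_z = nn.Linear(dim, query_key_dim)
        self.gamma = nn.Parameter(torch.randn(2, query_key_dim) * 0.02)
        self.beta = nn.Parameter(torch.zeros(2, query_key_dim))
        # Depthwise convolution over time for local correlations (assumption:
        # applied to the value path, Conformer-style).
        self.dw_conv = nn.Conv1d(hidden, hidden, conv_kernel,
                                 padding=conv_kernel // 2, groups=hidden)
        self.to_out = nn.Linear(hidden, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim)
        residual = x
        x = self.norm(x)
        u, v = self.to_uv(x).chunk(2, dim=-1)
        u, v = F.silu(u), F.silu(v)
        # Local branch: depthwise convolution over the time axis.
        v_local = self.dw_conv(v.transpose(1, 2)).transpose(1, 2)
        # Global branch: single-head gated attention with squared-ReLU scores.
        z = F.silu(self.to_z(x))
        q, k = (z.unsqueeze(-2) * self.gamma + self.beta).unbind(dim=-2)
        scores = torch.einsum('btd,bsd->bts', q, k) / x.shape[1]
        attn = F.relu(scores) ** 2
        v_global = torch.einsum('bts,bsh->bth', attn, v)
        # Fuse local and global context, apply the gate, and project back.
        out = self.to_out(u * (v_local + v_global))
        return out + residual


if __name__ == "__main__":
    block = ConvAugmentedGAU(dim=96)
    speech_features = torch.randn(2, 200, 96)   # (batch, frames, features)
    print(block(speech_features).shape)         # torch.Size([2, 200, 96])
```

Because the attention scores use a single shared projection instead of separate per-head query/key/value projections, a block of this kind needs fewer parameters and less computation than standard multi-head self-attention, which is the efficiency argument the abstract refers to.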