This paper examines the empirical evidence on the use of generative artificial intelligence (GenAI) in scientific writing. Google Scholar and PubMed were searched, and the included studies were analyzed by academic field, AI tool, writing task, study design, and main findings. Following the PRISMA guidelines, this scoping review included 18 studies published between 1 January 2023 and 1 January 2026, spanning medicine, education, dentistry, radiology, the humanities, library and information science, and cognitive science. The evidence base was dominated by studies of ChatGPT, making it the most empirically researched GenAI tool in this field. In the studies reviewed, GenAI performed well on measures of text quality (readability, fluency, and organization) and on efficiency, the latter especially in manuscript drafting, abstract writing, proposal development, and literature reviewing. However, the findings also revealed several limitations, including incorrect or fabricated references, inaccurate bibliographic metadata, shallow analysis, lack of originality, and insufficient methodological depth. Comparative evidence indicates that newer model versions show improved coherence and reasoning; reference reliability has also improved in these versions but remains a recurring problem. Overall, GenAI can be a useful assistive tool for scientific writing, but its usefulness depends on the task at hand and on human supervision, especially for verifying the accuracy of facts and their sources.