Preprint | Review | Version 1 | Preserved in Portico | This version is not peer-reviewed

Automatic Segmentation of White Matter Hyperintensities from Brain Magnetic Resonance Images in the Era of Deep Learning and Big Data – A Systematic Review

Version 1 : Received: 8 September 2020 / Approved: 9 September 2020 / Online: 9 September 2020 (11:42:16 CEST)
Version 2 : Received: 18 November 2020 / Approved: 20 November 2020 / Online: 20 November 2020 (11:16:44 CET)
Version 3 : Received: 20 November 2020 / Approved: 20 November 2020 / Online: 20 November 2020 (13:44:46 CET)

A peer-reviewed article of this Preprint also exists.

Journal reference: Computerized Medical Imaging and Graphics 2021, 88, 101867
DOI: 10.1016/j.compmedimag.2021.101867


Background: White matter hyperintensities (WMH) of presumed vascular origin are visible and quantifiable neuroradiological markers of brain parenchymal change. These changes may range from damage secondary to inflammation and other neurological conditions through to healthy ageing. Fully automatic WMH quantification methods are promising, yet traditional semi-automatic methods still seem to be preferred in clinical research. We systematically reviewed the literature for fully automatic methods developed in the last five years, to assess which techniques are considered state of the art and to identify trends in the analysis of WMH of presumed vascular origin.

Method: We registered the systematic review protocol with the International Prospective Register of Systematic Reviews (PROSPERO), registration number CRD42019132200. We searched Medline, ScienceDirect, IEEE Xplore, and Web of Science for fully automatic methods developed from 2015 to July 2020. We assessed the risk of bias and the applicability of the studies using QUADAS-2.

Results: The search yielded 2327 papers after removing 104 duplicates. After screening titles, abstracts and full text, 37 were selected for detailed analysis. Of these, 16 proposed a supervised segmentation method, 10 proposed an unsupervised segmentation method, and 11 proposed a deep learning segmentation method. Average Dice similarity coefficient (DSC) values ranged from 0.538 to 0.93, with the highest value obtained by a deep learning segmentation method. Only four studies validated their method in longitudinal samples, and eight performed an additional validation using clinical parameters. Only 8/37 studies made their method available in public repositories.

Conclusions: Although deep learning methods reported highly accurate results, we found no evidence that favours them over the more established k-NN, linear regression and unsupervised methods in this task. Data and code availability, bias in study design and ground truth generation influence the wider validation and applicability of these methods in clinical research.
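
For readers unfamiliar with the DSC values quoted in the Results, the sketch below illustrates how a Dice similarity coefficient between an automatic segmentation and a reference mask is commonly computed, as twice the overlap divided by the total number of labelled voxels in both masks. It is a minimal illustration, not taken from any of the reviewed studies: the function name `dice_coefficient`, the flattened toy masks, and the convention of returning 1.0 when both masks are empty are all assumptions made here.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks.

    DSC = 2 * |pred AND truth| / (|pred| + |truth|)
    """
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")

    # Voxels labelled as lesion in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Lesion voxels counted separately in each mask.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * overlap / total


# Hypothetical toy masks (1 = WMH voxel, 0 = background).
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 1, 1, 0, 1, 0]
print(f"DSC = {dice_coefficient(predicted, reference):.3f}")
```

In this toy case two of the three lesion voxels in each mask coincide, so the score is roughly 0.67. Real pipelines would apply the same formula to 3D label volumes, and per-study averages such as those in the abstract are means over subjects.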


Keywords: white matter lesions; white matter hyperintensities; supervised segmentation; unsupervised segmentation; deep learning; FLAIR hyperintensities
