Working Paper | Review | Version 3 | This version is not peer-reviewed

Automatic Segmentation of White Matter Hyperintensities from Brain Magnetic Resonance Images in the Era of Deep Learning and Big Data – A Systematic Review

Version 1 : Received: 8 September 2020 / Approved: 9 September 2020 / Online: 9 September 2020 (11:42:16 CEST)
Version 2 : Received: 18 November 2020 / Approved: 20 November 2020 / Online: 20 November 2020 (11:16:44 CET)
Version 3 : Received: 20 November 2020 / Approved: 20 November 2020 / Online: 20 November 2020 (13:44:46 CET)

A peer-reviewed article of this Preprint also exists.

Journal reference: Computerized Medical Imaging and Graphics 2021, 88, 101867
DOI: 10.1016/j.compmedimag.2021.101867


Background: White matter hyperintensities (WMH), of presumed vascular origin, are visible and quantifiable neuroradiological markers of brain parenchymal change. These changes may range from damage secondary to inflammation and other neurological conditions, through to healthy ageing. Fully automatic WMH quantification methods are promising, yet traditional semi-automatic methods still seem to be preferred in clinical research. We systematically reviewed the literature for fully automatic methods developed in the last five years, to assess which techniques are considered state of the art and to identify trends in the analysis of WMH of presumed vascular origin.

Method: We registered the systematic review protocol with the International Prospective Register of Systematic Reviews (PROSPERO), registration number CRD42019132200. We searched Medline, ScienceDirect, IEEE Xplore, and Web of Science for fully automatic methods developed from 2015 to July 2020. We assessed the risk of bias and the applicability of the studies using QUADAS-2.

Results: The search yielded 2327 papers after removing 104 duplicates. After screening titles, abstracts and full text, 37 were selected for detailed analysis. Of these, 16 proposed a supervised segmentation method, 10 proposed an unsupervised segmentation method, and 11 proposed a deep learning segmentation method. Average Dice Similarity Coefficient (DSC) values ranged from 0.538 to 0.91, with the highest value obtained by an unsupervised segmentation method. Only four studies validated their method in longitudinal samples, and eight performed an additional validation using clinical parameters. Only 8/37 studies made their method available in public repositories.

Conclusions: We found no evidence that favours deep learning methods over the more established k-NN, linear regression and unsupervised methods in this task. Data and code availability, bias in study design and ground truth generation influence the wider validation and applicability of these methods in clinical research.
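For readers unfamiliar with the metric behind the DSC values reported above, the following is a minimal sketch of how a Dice Similarity Coefficient can be computed between a predicted WMH mask and a reference mask, using the standard definition DSC = 2|A ∩ B| / (|A| + |B|). It is not taken from any of the reviewed studies. The function name `dice_coefficient`, the flattened binary-mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration only.

```python
from typing import Iterable


def dice_coefficient(predicted: Iterable[int], reference: Iterable[int]) -> float:
    """Dice overlap between two flattened binary masks (hypothetical helper).

    Any non-zero voxel value is treated as lesion, zero as background.
    """
    pred = [bool(v) for v in predicted]
    ref = [bool(v) for v in reference]
    if len(pred) != len(ref):
        raise ValueError("masks must contain the same number of voxels")

    # Voxels labelled as lesion in both masks.
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(pred) + sum(ref)

    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: 8 voxels, 4 lesion voxels in each mask, 3 of them shared.
predicted_mask = [0, 1, 1, 1, 0, 0, 1, 0]
reference_mask = [0, 1, 1, 0, 0, 1, 1, 0]
score = dice_coefficient(predicted_mask, reference_mask)
print(f"DSC = {score:.2f}")
```

In practice the masks would be 3D image volumes flattened to one dimension, and the value would typically be averaged across subjects, which is how an "average DSC" such as those summarised above is usually obtained.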


Keywords: White matter lesions; white matter hyperintensities; supervised segmentation; unsupervised segmentation; deep learning; FLAIR hyperintensities



Comments (1)

Comment 1
Received: 20 November 2020
Commenter: Maria Del C. Valdés Hernández
Commenter's Conflict of Interests: Author
Comment: Improved figure quality in the PDF version. Abstract figure added. Attended to an oversight related to a previous amendment of a sentence in subsection 3.9, Methods evaluation: it is not incorrect to state that 30/37 studies evaluated the performance of the proposed segmentation method using the Dice Similarity Coefficient (DSC) amongst other measurements of spatial agreement, as Schirmer et al. used the DSC to evaluate the brain extraction/alignment within their segmentation pipeline/scheme. But to avoid misinterpretations we now write: "Of the 37 studies, 29 studies evaluated the performance of their WMH segmentation method using the Dice Similarity Coefficient (DSC) among other metrics that measure spatial concordance between the results of the method proposed and reference segmentations."