A Generic Supervised Learning Framework for Fast Brain Extraction
Yuan Liu1,2, Benjamin Odry2, Hasan Ertan Cetingul2, and Mariappan Nadar2

1Vanderbilt Institute in Surgery and Engineering, Vanderbilt University, Nashville, TN, United States, 2Medical Imaging Technologies, Siemens Healthcare, Princeton, NJ, United States


Automatic brain extraction, a standard preprocessing step, typically suffers from long runtimes and inaccuracies caused by anatomical brain variations and the limited quality of MR images. We propose a generic supervised learning framework that builds binary classifiers to identify brain and non-brain tissue at different resolution levels, hierarchically performs voxel-wise classification for a test subject, and refines the brain boundary by applying a narrow-band level-set technique to the classification map. The proposed method is evaluated on multiple datasets with different acquisition sequences and scanner types, using uni- or multi-contrast images, and is shown to be fast, accurate, and robust.


Brain extraction is a standard preprocessing step for subsequent tasks such as bias field correction, tissue segmentation, and cortical surface reconstruction. It is challenging to fully automate due to the anatomical variability of the brain and imperfections in MR images. Existing approaches can be categorized into boundary-based1, region-based2, atlas-based3, learning-based4, and hybrid5 methods, all of which have strengths and weaknesses (e.g., sensitivity to noise for boundary/region-based methods, computational cost for atlas-based methods, and feature/classifier design for learning-based methods). We herein present a generic supervised learning framework for fast, accurate, and robust brain extraction that is readily extendible to multi-contrast data.


Our framework is based on multi-resolution classification of uni- or multimodal MR data. First, standard preprocessing steps are applied to normalize the intensity distribution of the input data and affinely register it to a preselected template. Next, the training stage builds binary classifiers at four spatial resolution levels (i.e., the original images and versions downsampled by factors of 2, 4, and 8); for each level this involves data sampling, feature extraction, and random forest learning. Specifically, a sampling region is estimated by morphologically processing the ground-truth mask to highlight a narrow band along the boundary. For each sample randomly drawn from this region, conventional spatial features, multi-scale intensity contextual features6, and spatial prior features (i.e., contextual features computed on the averaged group mask at the coarsest level) are extracted, and random forests7 are trained as binary classifiers. For an unseen image, starting from the average group mask at the coarsest level, we hierarchically refine the boundary by classifying voxels in a fixed-size narrow band around the previous estimate. At the finest level, we couple the voxelwise classification with a narrow-band level set approach8 using a Chan-Vese9 region force to dynamically determine the test voxels. As the front propagates, the narrow band shifts accordingly, and classification scores are computed only for newly entering voxels. This allows the surface to recover from earlier mistakes without examining a large search region. We also include a curvature term in the front propagation as a regularizer to maintain a smooth closed surface.
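As a rough illustration of the per-level training step, the sketch below samples voxels in a narrow band around a ground-truth mask, extracts simple intensity-context features, and fits a random forest. This is a minimal sketch, not the authors' implementation: the offset pattern, band width, and forest size are assumptions, and the multi-scale contextual features of Ref. 6 are reduced here to plain offset-sampled intensities (using NumPy, SciPy, and scikit-learn).

```python
import numpy as np
from scipy import ndimage
from sklearn.ensemble import RandomForestClassifier

def narrow_band(mask, width=3):
    """Voxels within `width` of the mask boundary (dilation minus erosion)."""
    dil = ndimage.binary_dilation(mask, iterations=width)
    ero = ndimage.binary_erosion(mask, iterations=width)
    return dil & ~ero

def context_features(image, coords, offsets):
    """Intensity-context features: image values sampled at fixed offsets
    around each voxel (a simplified stand-in for the multi-scale
    contextual features of Ref. 6; np.roll wraps at edges, which a real
    implementation would handle with padding)."""
    feats = []
    for dz, dy, dx in offsets:
        shifted = np.roll(image, shift=(dz, dy, dx), axis=(0, 1, 2))
        feats.append(shifted[coords])
    return np.stack(feats, axis=1)

OFFSETS = [(0, 0, 0), (2, 0, 0), (-2, 0, 0),
           (0, 2, 0), (0, -2, 0), (0, 0, 2), (0, 0, -2)]

def train_level(image, mask, n_samples=5000, seed=0):
    """Train one resolution level's brain/non-brain classifier from
    samples drawn inside the boundary narrow band."""
    rng = np.random.default_rng(seed)
    band_pts = np.argwhere(narrow_band(mask))
    idx = rng.choice(len(band_pts), size=min(n_samples, len(band_pts)),
                     replace=False)
    pts = tuple(band_pts[idx].T)
    X = context_features(image, pts, OFFSETS)
    y = mask[pts].astype(int)  # 1 = brain, 0 = non-brain
    return RandomForestClassifier(n_estimators=50, random_state=seed).fit(X, y)
```

At test time, the analogous step would call `clf.predict_proba` on features extracted only inside the current narrow band, one level at a time from coarse to fine.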


We evaluate our framework on multiple datasets with different acquisition sequences and scanner types, using single- or multi-contrast images. We first consider the LONI-LPBA4010 dataset, which comprises 40 normal subjects, each with a 1.5T T1-weighted image and a manually delineated brain mask. We randomly split the dataset into 20 subjects for training and 20 for testing. The worst, an average, and the best results are illustrated in Figure 1, together with those of BET, a publicly available tool in FSL. Our segmentations are accurate, whereas BET consistently oversegments the brain near the hippocampus. This observation is reflected in Figure 2 for all test subjects: our method yields higher Dice, Jaccard, and specificity values, while BET achieves higher sensitivity due to oversegmentation. The same classifiers are then tested on a traumatic brain injury dataset of 10 subjects, each with a 3T T1-weighted image, and acceptable results are obtained (Figure 3). Additional tests are performed on the ATAG11 dataset of 53 normal subjects, each with a 7T T1-weighted image. We randomly select 20 subjects to retrain the classifiers and test on the remaining 33. As there are no manual annotations, we use the brain masks produced by the CBS12 tools for MIPAV as pseudo ground truth. We again obtain results that are better than those produced (whenever reasonable) by the CBS tools (Figure 4). In our final experiment, we consider a subset of the HCP13 dataset, which contains 50 normal subjects with 3T T1-weighted and T2-weighted images. To adapt our algorithm to multi-modal images, we extract the intensity contextual features from each co-registered image sequence independently and concatenate them to form the full feature set. Taking the brain masks computed by the HCP processing pipeline as pseudo ground truth, we retrain our models on 20 randomly selected cases and evaluate them on the remaining 30.
With a mean Dice coefficient of 0.98 and a mean Jaccard index of 0.96, our multi-contrast framework achieves high accuracy; a typical case is shown in Figure 5.
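The overlap measures reported above (Dice, Jaccard, sensitivity, specificity) follow their standard definitions and can be computed from two binary masks as sketched below. This is a generic illustration of those definitions, not code from the described pipeline; it assumes both masks are non-empty and non-full so no denominator is zero.

```python
import numpy as np

def overlap_metrics(pred, truth):
    """Standard overlap measures between a predicted and a ground-truth
    binary mask: Dice coefficient, Jaccard index, sensitivity (true
    positive rate), and specificity (true negative rate)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.count_nonzero(pred & truth)    # brain voxels found
    fp = np.count_nonzero(pred & ~truth)   # non-brain labeled brain
    fn = np.count_nonzero(~pred & truth)   # brain voxels missed
    tn = np.count_nonzero(~pred & ~truth)  # non-brain correctly rejected
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "jaccard": tp / (tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

Note that an oversegmenting method (as observed for BET) inflates sensitivity at the cost of specificity, which is why all four measures are reported together.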


We present a generic framework for MR brain extraction using hierarchical voxelwise classification coupled with a narrow-band level set. Through a series of experiments, we demonstrate that our method is accurate and robust across various scanners and magnetic field strengths, with natural extendibility to multi-contrast data. It is also very fast, taking only 30 seconds to strip the skull from a standard preprocessed image.


Yuan Liu performed this work while at Siemens Healthcare.


1. Smith S. Fast robust automated brain extraction. Hum Brain Mapp. 2002; 17(3):143-55.

2. Shattuck D, Sandor-Leahy S, Schaper K, et al. Magnetic resonance image tissue classification using a partial volume model. Neuroimage. 2001; 13(5):856-76.

3. Heckemann R, Ledig C, Gray K, et al. Brain extraction using label propagation and group agreement: Pincram. PLoS One. 2015; 10(7):e0129211.

4. Iglesias J, Liu C, Thompson P, et al. Robust brain extraction across datasets and comparison with publicly available methods. IEEE Trans Med Imaging. 2011; 30(9):1617-34.

5. Shi F, Wang L, Dai Y, et al. LABEL: pediatric brain extraction using learning-based meta-algorithm. Neuroimage. 2012; 62(3):1975-86.

6. Pauly O, Glocker B, Criminisi A, et al. Fast multiple organ detection and localization in whole-body MR Dixon sequences. In: MICCAI. 2011; 239-47.

7. Breiman L. Random forests. Mach Learn. 2001; 45(1):5-32.

8. Adalsteinsson D, Sethian J. A fast level set method for propagating interfaces. J Comput Phys. 1995; 118(2):269-77.

9. Chan T, Vese L. Active contours without edges. IEEE Trans Image Process. 2001; 10(2):266-77.

10. Shattuck D, Mirza M, Adisetiyo V, et al. Construction of a 3D probabilistic atlas of human cortical structures. Neuroimage. 2008; 39(3):1064-80.

11. Forstmann B, Keuken M, Schafer A, et al. Multi-modal ultra-high resolution structural 7-Tesla MRI data repository. Sci Data. 2014; 1:140050.

12. Bazin P, Weiss M, Dinse J, et al. A computational framework for ultra-high resolution cortical segmentation at 7 Tesla. Neuroimage. 2014; 93(2):201-9.

13. Van Essen DC, Ugurbil K, Auerbach E, et al. The human connectome project: a data acquisition perspective. Neuroimage. 2012; 62(4):2222-31.


Figure 1: The worst (top), an average (middle), and the best (bottom) results using our method on the LPBA40 dataset, with manual annotations shown as red contours, our results in green, and BET results in pink. Right panels show our resulting surfaces color-coded by surface error, with blue indicating small errors and red indicating large errors.

Figure 2: Segmentation accuracy on the LPBA40 dataset for the proposed method and BET, measured by Dice coefficient, Jaccard index, sensitivity, and specificity.

Figure 3: The worst (top), an average (middle), and the best (bottom) results using the proposed method on the traumatic brain injury dataset, with our results shown in green and the BET results in pink.

Figure 4: A typical case of brain extraction at 7T using our algorithm. The pseudo ground-truth brain mask is shown as red contours and ours as green contours.

Figure 5: A typical case of multi-contrast brain extraction using our algorithm based on T1 (top) and T2 (bottom) images. The pseudo ground-truth brain mask is shown as red contours and ours as green contours.

Proc. Intl. Soc. Mag. Reson. Med. 24 (2016)