Precision of Manual vs. Automated Corpus Callosum Atrophy Measurements in Multiple Sclerosis
Michael Platten1,2, Katarina Fink1,3, Juha Martola1, and Tobias Granberg1,2

1Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden, 2Department of Radiology, Division of Neuroradiology, Karolinska University Hospital, Stockholm, Sweden, 3Department of Neurology, Karolinska University Hospital, Stockholm, Sweden


Corpus callosum atrophy is a favorable imaging biomarker in MS. Manual measurements of the corpus callosum are considered the best current standard, but their repeatability and reproducibility are uncertain. FreeSurfer is automatic software that can measure the corpus callosum volumetrically. Using a representative cohort of 9 MS patients, each scanned twice with repositioning on 3 different MRI scanners, we compared manual and automatic measurements of the corpus callosum. We found the longitudinal FreeSurfer method to be the most precise. Thus, we recommend that this method be used in future studies measuring corpus callosum atrophy in patients with MS.


The corpus callosum consists of a bundle of commissural fibers enabling communication between the two cerebral hemispheres. In multiple sclerosis (MS), the corpus callosum is affected by both focal lesions and Wallerian degeneration from distant pathology.1 This is in contrast to normal aging, where the corpus callosum is relatively resistant to change.2,3 It is thus a strategic anatomical structure to evaluate in MS patients.

Manual measurements of the corpus callosum have high intra- and inter-rater agreement and are considered the best current standard.4 However, it is unknown how repeatable and reproducible these measurements are within and between different MRI scanners. Meanwhile, the rapid development of automatic segmentation may provide more efficient and precise alternatives. Figure 1 illustrates manual 2D and automatic 3D measurements of the corpus callosum.

FreeSurfer 6.0.0 (Harvard University, Boston, MA, USA) is fully automatic software that can volumetrically measure brain structures, including the corpus callosum. A relatively new feature of FreeSurfer allows longitudinal comparisons within each patient, potentially making the comparison more precise.5
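To make the difference between the cross-sectional and longitudinal streams concrete, the following minimal Python sketch assembles the three `recon-all` stages (independent cross-sectional runs, a within-subject base template, and longitudinal runs initialized from that template) as command lists. The subject and scan identifiers are hypothetical, and the abstract does not state how scans were grouped into templates, so placing all listed scans of a patient into one base is an assumption made purely for illustration.

```python
from typing import List


def longitudinal_commands(subject: str, scan_ids: List[str]) -> List[List[str]]:
    """Build recon-all command lines for FreeSurfer's longitudinal stream.

    Assumption: every scan in `scan_ids` has already been imported under
    that ID; names such as "ms01_aera_1" are hypothetical.
    """
    base_id = f"{subject}_base"  # hypothetical naming scheme for the template
    commands: List[List[str]] = []

    # Stage 1: independent cross-sectional processing of each scan.
    for scan in scan_ids:
        commands.append(["recon-all", "-s", scan, "-all"])

    # Stage 2: within-subject template built from all listed scans.
    base_cmd = ["recon-all", "-base", base_id]
    for scan in scan_ids:
        base_cmd += ["-tp", scan]
    base_cmd.append("-all")
    commands.append(base_cmd)

    # Stage 3: longitudinal run of each scan, initialized from the template.
    for scan in scan_ids:
        commands.append(["recon-all", "-long", scan, base_id, "-all"])

    return commands


if __name__ == "__main__":
    for cmd in longitudinal_commands("ms01", ["ms01_aera_1", "ms01_aera_2"]):
        print(" ".join(cmd))
```

The sketch only prints the commands; running them requires a configured FreeSurfer installation.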

We aim to compare the precision of manual and automated measurements of the corpus callosum with both cross-sectional and longitudinal FreeSurfer analysis, in order to determine the most precise method for measuring corpus callosum atrophy.


To study the precision of corpus callosum measurements, a representative cohort of 9 MS patients, representing all subtypes, was scanned twice with repositioning using a 3D T1-weighted sequence on 3 different MRI scanners (Siemens Aera 1.5 T, Avanto 1.5 T, Trio 3.0 T), for a total of 6 scans per patient, all on the same day. The cohort demographics are summarized in Table 1, and Figure 2 shows the 6 scans of one patient.

For every scan, the normalized corpus callosum area was calculated by measuring the corpus callosum area in the mid-sagittal slice and dividing it by the intracranial area (Figure 1). The measurements were performed in a blinded fashion by a trained rater (M.P., medical student). The coefficient of variation was used to study the intra- and inter-scanner agreement, and paired t-tests were used to compare the manual, cross-sectional FreeSurfer and longitudinal FreeSurfer measurements. We also explored the correlation between each method and the patients’ clinical disability in the form of normalized Symbol Digit Modalities Test (SDMT) and Expanded Disability Status Scale (EDSS).
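As a minimal sketch of these precision metrics, the Python code below computes a normalized area, a coefficient of variation, and a paired t-statistic using only the standard library. How the intra- and inter-scanner coefficients were aggregated is not specified in this abstract. Here the intra-scanner value is assumed to be the mean coefficient of variation over the repeat scans on each scanner, and the inter-scanner value the coefficient of variation of the per-scanner means. All function names and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Dict, List, Sequence


def normalized_area(cc_area: float, intracranial_area: float) -> float:
    """Corpus callosum area divided by intracranial area (same units)."""
    return cc_area / intracranial_area


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


def intra_scanner_cv(scans: Dict[str, List[float]]) -> float:
    """Assumed definition: mean CV of the repeat scans on each scanner."""
    return mean([coefficient_of_variation(v) for v in scans.values()])


def inter_scanner_cv(scans: Dict[str, List[float]]) -> float:
    """Assumed definition: CV of the per-scanner mean measurements."""
    return coefficient_of_variation([mean(v) for v in scans.values()])


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """t-statistic of the paired differences a - b (n - 1 degrees of freedom)."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


if __name__ == "__main__":
    # Hypothetical normalized areas for one patient: two scans per scanner.
    patient = {
        "Aera": [0.0410, 0.0418],
        "Avanto": [0.0405, 0.0412],
        "Trio": [0.0421, 0.0416],
    }
    print(f"intra-scanner CV: {intra_scanner_cv(patient):.2f}%")
    print(f"inter-scanner CV: {inter_scanner_cv(patient):.2f}%")
```

To compare two methods, the per-patient coefficients of variation from each method would be passed to `paired_t_statistic`; converting the statistic to a P value requires the t-distribution, for example via `scipy.stats.ttest_rel`.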


FreeSurfer's longitudinal method was more precise than the manual method both within and between scanners (intra-scanner P=0.0018, inter-scanner P=0.030). It was also more precise than the cross-sectional method within scanners (intra-scanner P=0.0050), with a trend between scanners (inter-scanner P=0.070), as shown in Table 2. Meanwhile, there was no significant difference in the coefficient of variation between the manual measurements and FreeSurfer's cross-sectional method (intra-scanner P=0.26, inter-scanner P=0.81).

Table 3 shows the correlations between the different measuring methods and the patients’ clinical disability as measured by normalized SDMT and EDSS. FreeSurfer’s longitudinal method yielded highly significant correlations with both normalized SDMT and EDSS. FreeSurfer’s cross-sectional method yielded no significant correlations, and the manual normalized corpus callosum area (nCCA) yielded a counter-intuitive inverse relationship to normalized SDMT and no correlation to EDSS.
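The abstract does not state which correlation coefficient was used. As an illustration only, the sketch below computes a Spearman rank correlation, which is assumed here because EDSS is an ordinal scale. The function names are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

In this setting, `x` would hold one corpus callosum measurement per scan and `y` the disability score of the corresponding patient.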


The aim of this study was to compare the precision of manual and automatic corpus callosum measurements. We show that FreeSurfer’s longitudinal method is significantly more precise than both FreeSurfer’s cross-sectional method and the current standard manual method. We also show that FreeSurfer’s longitudinal method has the strongest correlations with the patients’ disability status as measured by normalized SDMT and EDSS. These correlations confirm previous findings that atrophy correlates well with physical and cognitive disability in MS.6–8 Corpus callosum measurements are considered powerful biomarkers of disease progression, and corpus callosum damage has been shown to predict progression of disability in primary progressive MS.9

It is important to continuously evaluate the precision of new segmentation methods to ensure that the most suitable software is used. Heinen et al. showed that the robustness of FreeSurfer, along with other segmentation tools, across varying magnetic field strengths differs between anatomical structures.10 Furthermore, Germeyan et al. noted that even though FreeSurfer overestimated hippocampus size, its detection of atrophy rates was comparable to that of manual raters.11 Our study shows that FreeSurfer’s longitudinal method has significantly better correlations with disability status and significantly higher precision, which is particularly important when following the same patient over time. This is central in MS, where the chronic disease and the treatment response are followed longitudinally.


This study establishes that the precision of FreeSurfer’s longitudinal method is superior to the current standard of manual measurements. Thus, we recommend that subsequent studies use this tool for measuring and monitoring corpus callosum atrophy in patients with MS.


We would like to thank all participating patients as well as all the staff who made this study possible. This research was supported by the Stockholm City Council and Karolinska Institutet (ALF 20120213 and 20150166). Dr. Granberg was supported by the Swedish Society for Medical Research.


1. Evangelou N, Konz D, Esiri MM, Smith S, Palace J, Matthews PM. Regional axonal loss in the corpus callosum correlates with cerebral white matter lesion volume and distribution in multiple sclerosis. Brain. 2000;123:1845–1849.

2. Mitchell TN, Free SL, Merschhemke M, Lemieux L, Sisodiya SM, Shorvon SD. Reliable callosal measurement: population normative data confirm sex-related differences. AJNR Am J Neuroradiol. 2003;24:410–418.

3. Martola J, Stawiarz L, Fredrikson S, et al. Progression of non-age-related callosal brain atrophy in multiple sclerosis: a 9-year longitudinal MRI study representing four decades of disease development. J Neurol Neurosurg Psychiatry. 2007;78:375–380.

4. Granberg T, Bergendal G, Shams S, et al. MRI-Defined Corpus Callosal Atrophy in Multiple Sclerosis: A Comparison of Volumetric Measurements, Corpus Callosum Area and Index. J Neuroimaging. 2015;25:996–1001.

5. Reuter M, Schmansky NJ, Rosas HD, Fischl B. Within-subject template estimation for unbiased longitudinal image analysis. Neuroimage. 2012;61:1402–1418.

6. Granberg T, Martola J, Bergendal G, et al. Corpus callosum atrophy is strongly associated with cognitive impairment in multiple sclerosis: Results of a 17-year longitudinal study. Mult Scler J. 2015;21:1151–1158.

7. Yaldizli Ö, Penner I-K, Frontzek K, et al. The relationship between total and regional corpus callosum atrophy, cognitive impairment and fatigue in multiple sclerosis patients. Mult Scler J. 2014;20:356–364.

8. Bergendal G, Martola J, Stawiarz L, Kristoffersen-Wiberg M, Fredrikson S, Almkvist O. Callosal atrophy in multiple sclerosis is related to cognitive speed. Acta Neurol Scand. 2013;127:281–289.

9. Bodini B, Cercignani M, Khaleeli Z, et al. Corpus callosum damage predicts disability progression and cognitive dysfunction in primary-progressive MS after five years. Hum Brain Mapp. 2013;34:1163–1172.

10. Heinen R, Bouvy WH, Mendrik AM, Viergever MA, Biessels GJ, de Bresser J. Robustness of Automated Methods for Brain Volume Measurements across Different MRI Field Strengths. PLoS ONE [online serial]. 2016;11. Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5087903/. Accessed October 29, 2017.

11. Germeyan SC, Kalikhman D, Jones L, Theodore WH. Automated Versus Manual Hippocampal Segmentation in Pre- and Postoperative Epilepsy Patients. Epilepsia. 2014;55:1374–1379.


Figure 1. Manual (above) and automated FreeSurfer (below) measurements of corpus callosum atrophy.

Figure 2. MRI scans of a patient. The same 48-year-old female MS patient was scanned 6 times, with repositioning, using three different scanners: Aera (1.5 T), Avanto (1.5 T) and Trio (3.0 T).

Table 1. Cohort Demographics (N=9)

Table 2. Repeatability and reproducibility of the corpus callosum measurements. The coefficients of variation of the methods are compared using paired t-tests.

Table 3. Correlations of the corpus callosum measurements with cognitive and physical disability. All nine patients were scanned twice on each of three different scanners, for a total of 54 scans, which were then correlated with the patients’ disability status.

Proc. Intl. Soc. Mag. Reson. Med. 26 (2018)