Lifespan study by cross-sectional case-control comparisons in sliding age windows: test of ASD heterogeneity with One-Class Classifiers
Piernicola Oliva1, Alessia Giuliano2, Paolo Bosco2, Elisa Ferrari3, Michela Tosetti4, Filippo Muratori4, Sara Calderoni4, and Alessandra Retico2

1University of Sassari and National Institute for Nuclear Physics, Cagliari, Italy, 2National Institute for Nuclear Physics, Pisa, Italy, 3University of Pisa, Physics Department and National Institute for Nuclear Physics, Pisa, Italy, 4IRCCS Stella Maris Foundation, Pisa, Italy

Synopsis

Cross-sectional studies have reported inconsistent findings on the distinctive neuroanatomical characteristics of Autism Spectrum Disorders (ASD). We set up a lifespan study through a series of machine-learning-based case-control comparisons on sub-cohorts obtained by partitioning a large structural MRI data sample (age range: 2-25 years) into subsamples with partially overlapping, narrower age ranges (3 or 4 years wide). We implemented One-Class Support Vector Machines on these sub-cohorts, obtaining the temporal evolution of the case-control separation ability, which is related to the detectability of neuroimaging-based biomarkers. Distinctive common features characterize children with ASD under 5 years of age; the heterogeneity of the ASD condition dominates from adolescence onwards.

Introduction

The distinctive neuroanatomical characteristics of subjects with Autism Spectrum Disorders (ASD) have been widely investigated through machine learning (ML) analyses of MRI data. Cross-sectional studies have reported inconsistent findings across the lifespan. Although longitudinal studies are required to shed light on this field, we set up a series of cross-sectional case-control comparisons by partitioning a large sample of structural MRI (sMRI) scans of subjects aged 2 to 25 years into subsamples with partially overlapping, narrower age ranges (3 or 4 years wide). We implemented One-Class Classifiers (OCC) based on Support Vector Machines (OCC-SVM)1,2. Unlike two-class classification methods, OCC, also known as data description methods, are based on the description of one class of subjects only, referred to as the positive class3. New examples are then tested for their similarity to the positive class and, if they are too dissimilar, labelled as outliers. Applying the OCC-SVM to features extracted from sMRI data of subjects with ASD and of control subjects allows the separability of the two classes to be investigated, which in turn provides information about the detectability of neuroimaging-based biomarkers. By reporting the classification performance, measured as the area under the ROC curve (AUC), at the central value of each age window, the age trend of the case-control separation ability can be derived. In addition, the OCC-SVM analysis reveals which of the two cohorts is more homogeneous in its neuroanatomical profile, providing information on the impact of the heterogeneity of the ASD condition on ML-based analyses across the lifespan.

Methods

We considered the structural MRI scans of a sample of 768 male subjects (age range 7–25 years) extracted from the ABIDE multi-site data collection4. To extend the age range of this sample to early childhood, we added the structural MRI scans of 38 male children (age range 2-7 years) collected at the IRCCS Stella Maris Foundation (SMF) in Pisa (Italy). Sample characteristics, including normal full intelligence quotient (FIQ)/non-verbal IQ (NVIQ), are reported in Fig. 1. For each data set, partially overlapping age windows were considered, with the specifications provided in Fig. 2. The cortex was parcellated into 62 structures (DKT atlas) with FreeSurfer v6.0. Six features were computed for each structure: volume, mean and standard deviation of cortical thickness, mean curvature, surface area, and curvature index. In addition, 8 global volumetric and surface features were included, together with the volumes of 39 subcortical regions, leading to 419 features for each subject (62 × 6 + 8 + 39). We applied the OCC-SVM classifiers to the feature vectors, using a non-linear kernel with Radial Basis Functions, setting the training parameter controlling the smoothness of the spherical boundary to ν=0.1, and assigning the heuristic value to γ, which controls the number of training errors2. Classifier performance was evaluated with a leave-pair-out cross-validation procedure. The SVM analysis was developed in Matlab 2016 using the interfaces to the LIBSVM package and the Statistical Pattern Recognition Toolbox (STPRTool).
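As an illustration only, the procedure described above can be sketched in Python with scikit-learn rather than with the Matlab/LIBSVM/STPRTool setup actually used. The sketch rests on several assumptions: the data are synthetic, the function name `leave_pair_out_scores` is hypothetical, `gamma="scale"` is a stand-in for the unspecified heuristic value of γ, and leave-pair-out is read here as holding out one matched case-control pair at a time while training on the remaining subjects of the positive class only.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.svm import OneClassSVM

# 62 cortical structures x 6 features + 8 global features + 39 subcortical volumes
N_FEATURES = 62 * 6 + 8 + 39  # 419, as stated in the text


def leave_pair_out_scores(X_pos, X_other, nu=0.1, gamma="scale"):
    """Leave-pair-out evaluation of a one-class SVM (hypothetical helper).

    X_pos   : features of the positive (training) class, one row per subject
    X_other : features of the other class, row i being paired with X_pos[i]
    Returns the decision scores and the labels (1 = positive class, 0 = other).
    """
    n_pairs = min(len(X_pos), len(X_other))
    scores, labels = [], []
    for i in range(n_pairs):
        # Train on the positive class only, excluding the held-out subject.
        train = np.delete(X_pos, i, axis=0)
        model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(train)
        # Score the held-out pair: higher score = more similar to the positive class.
        held_out = np.vstack([X_pos[i], X_other[i]])
        scores.extend(model.decision_function(held_out))
        labels.extend([1, 0])
    return np.array(scores), np.array(labels)


# Synthetic stand-in data (assumption): 30 matched pairs, the second group shifted.
rng = np.random.default_rng(0)
X_ctrl = rng.normal(0.0, 1.0, size=(30, N_FEATURES))
X_asd = rng.normal(1.0, 1.0, size=(30, N_FEATURES))

# Controls as the positive class; swapping the arguments trains on the ASD group.
scores, labels = leave_pair_out_scores(X_ctrl, X_asd)
auc = roc_auc_score(labels, scores)
print(f"AUC with controls as positive class: {auc:.2f}")
```

Running the same function with the two groups swapped gives the second AUC curve, so that the two training choices can be compared within each age window.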

Results

The age trend of the AUC values is reported in Fig. 3 for OCC-SVM trained using either the control group (blue curve) or the population with ASD (red curve) as the positive class. We found that children with ASD under 5 years of age show common neuroanatomical features that allow them to be distinguished from age- and NVIQ-matched controls: OCC-SVM trained on data of children with ASD achieve an AUC over 70%. Children with ASD between 5 and 13 years of age cannot be distinguished from controls, whether the classifier is trained on the ASD or on the control sample. Finally, from adolescence onwards, the separability of the two classes of subjects becomes apparent again, with an AUC in the 60-70% range above 18 years of age. In this age range, however, the more homogeneous class, i.e. the one enclosed in the OCC-SVM hypersphere, is the control class.

Discussion and Conclusions

The discrimination capability of OCC-SVM classifiers has been shown to depend on age. Interestingly, the age trends of the AUC obtained by training on the control class and on the ASD class intersect and remain at chance level between 5 and 13 years of age. In early childhood, male subjects with ASD show a common neuroanatomical profile that allows them to be distinguished from controls, whereas after 13 years of age the heterogeneity of the ASD condition dominates.

Acknowledgements

This work has been supported by the Tuscany Government (PAR-FAS2007-2013, Bando FAS Salute 2014) through the ARIANNA Project (C52I16000020002) https://arianna.pi.infn.it

References

1. Mourão-Miranda, J., Bokde, A.L., Born, C., Hampel, H., and Stetter, M. (2005). Classifying brain states and determining the discriminating activation patterns: support vector machine on functional MRI data. Neuroimage 28, 980–995. doi: 10.1016/j.neuroimage.2005.06.070

2. Retico, A., Gori, I., Giuliano, A., Muratori, F., and Calderoni, S. (2016). One-class Support Vector Machines identify the language and default mode regions as common patterns of structural alterations in young children with Autism Spectrum Disorders. Frontiers in Neuroscience 10:306. doi: 10.3389/fnins.2016.00306

3. Moya, M.M., Koch, M.W., and Hostetler, L.D. (1993). One-Class Classifier Networks for Target Recognition Applications. NASA STI/Recon Technical Report N, Vol. 93, 24043. http://adsabs.harvard.edu/abs/1993STIN...9324043M

4. Di Martino, A., et al. (2014). The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Molecular Psychiatry 19, 659–667. doi: 10.1038/mp.2013.78

Figures

Fig. 1. Dataset composition and sample characteristics. Legend: Autism Brain Imaging Data Exchange (ABIDE) data cohort; Stella Maris Foundation (SMF) data cohort; autism spectrum disorder (ASD); controls (CTR); standard deviation (SD); full intelligence quotient (FIQ); non-verbal intelligence quotient (NVIQ). The acquisition parameters for the ABIDE collection can be found at http://fcon_1000.projects.nitrc.org/indi/abide/. SMF MRI data are T1-weighted series (FSPGR) with a voxel size of 1x1x1 mm3, acquired with a 1.5 T GE Signa Neuro-optimized System.

Fig. 2. Table showing the central age values for each age window considered in the SMF and the ABIDE datasets (first row). The partially overlapping age windows have a constant width within each dataset: 4 years for the ABIDE sample and 3 years for the SMF sample. The number of cases in each age window is reported (second row). Cases and controls from both the SMF and the ABIDE samples have been paired to allow the leave-pair-out cross-validation of classifier performance. ASD and control subsamples are matched for age and NVIQ/FIQ in each bin.
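The partitioning into partially overlapping age windows can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `sliding_age_windows`, the 1-year step between consecutive windows and the example ages are all hypothetical, since the actual window centres are those listed in Fig. 2; only the 3-year width (SMF) is taken from the caption.

```python
def sliding_age_windows(ages, start, stop, width, step):
    """Return a list of (central_age, member_indices) for windows [lo, lo + width).

    Consecutive windows are shifted by `step`; they overlap whenever step < width.
    """
    windows = []
    k = 0
    while start + k * step + width <= stop + 1e-9:
        lo = start + k * step
        hi = lo + width
        # Indices of the subjects whose age falls inside the current window.
        members = [i for i, age in enumerate(ages) if lo <= age < hi]
        windows.append((lo + width / 2.0, members))
        k += 1
    return windows


# Hypothetical ages (years) for a small early-childhood sample.
ages = [2.5, 3.1, 4.0, 4.8, 5.5, 6.2, 6.9]

# 3-year-wide windows as for the SMF sample; the 1-year step is an assumption.
windows = sliding_age_windows(ages, start=2.0, stop=7.0, width=3.0, step=1.0)
for centre, members in windows:
    print(f"window centred at {centre} years: {len(members)} subjects")
```

Each subject can fall into more than one window, which is why results from neighbouring windows are not statistically independent.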

Fig. 3. AUC values obtained by training OCC-SVM on either ASD or control subjects. Performance around or below chance level (0.5) means that the hypersphere enclosing most positive cases (i.e. the training class) also contains the majority of cases of the other class, supporting the conclusion that, in this case, the positive class does not show strong common features that make it separable from the other class. By contrast, an AUC above 0.5 means that cases in the training class share relevant common features that define a robust boundary between the two classes.
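The interpretation of the AUC given in this caption can be made concrete with a small sketch that computes the AUC directly from classifier scores, as the probability that a randomly chosen subject of the training class receives a higher score than a randomly chosen subject of the other class. The function name `auc_from_scores` and the score values are hypothetical and serve only to illustrate the two regimes.

```python
def auc_from_scores(pos_scores, other_scores):
    """AUC as the fraction of (positive, other) pairs ranked correctly.

    Ties count as half a correct ranking, so identical score
    distributions give exactly the chance level of 0.5.
    """
    wins = 0.0
    for p in pos_scores:
        for o in other_scores:
            if p > o:
                wins += 1.0
            elif p == o:
                wins += 0.5
    return wins / (len(pos_scores) * len(other_scores))


# Training class clearly inside the boundary, other class outside: AUC = 1.0
separable = auc_from_scores([0.9, 0.7, 0.4], [0.3, 0.1, -0.2])

# Both classes receive the same scores: AUC = 0.5 (chance level)
overlapping = auc_from_scores([0.2, 0.5], [0.2, 0.5])

print(separable, overlapping)
```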

Proc. Intl. Soc. Mag. Reson. Med. 26 (2018)