Ariel Rokem^{1}

Models of diffusion MRI are mathematical expressions that describe, summarize, and approximate the data. The values of the fitted parameters are used to interpret the data in light of the structure of the tissue or its physical properties. This presentation will introduce a framework for model comparison based on cross-validation: the model is fit to one part of the data, and the model parameters are used to predict the signal in another part of the data. Cross-validation is used to assess both the reliability of the model parameters and the accuracy of the model with respect to the data.

Diffusion MRI (dMRI) probes the structure of biological tissue by sensitizing the measurement to the diffusion of water molecules. The data, properly interpreted, can support different kinds of inferences about the measured tissue. Among the main analytic tools used to interpret the data are models: mathematical expressions that can be matched to the data by adjusting variable parameters. Models are useful because they allow us to describe, summarize, and approximate the data. In some cases, the values of the fitted parameters can be used to interpret the data in light of the process or structure that generated it. Because the expressions comprising models are generative, they can sometimes also be used to make predictions about other measurements.

Over the years, researchers have proposed many different models to explain and interpret dMRI data, with a variety of different goals. In this presentation, we will compare and contrast several different kinds of models. Phenomenological models aim to accurately approximate the data, but their parameters do not necessarily map onto specific biophysical or microscopic properties of the tissue, so there is some ambiguity in their interpretation. That said, phenomenological models, such as the diffusion tensor model (Basser et al. 1994) and the diffusion kurtosis model (Jensen et al. 2005), have been very useful and are still widely used in applications of dMRI to study brain structure and its relation to cognition and behavior, and in clinical applications. Mechanistic models, on the other hand, point to specific biophysical mechanisms that affect the signal. One class of mechanistic models is mixture models, which interpret the signal in each voxel as a combination of signals from different compartments that occupy partial volumes of the voxel. Examples include modeling perfusion and diffusion in tandem (Le Bihan et al. 1988), and accounting for partial volumes of CSF at the white-matter/CSF boundary or in brain pathologies (Pasternak et al. 2009; Hoy et al. 2014). Mixture models are also often used in applications of dMRI to tractography, because they allow multiple directions of nerve fibers to be tracked even within a single voxel (as reviewed in Rokem et al. 2015). Another class of mechanistic models exploits knowledge about the effects of microscopic structures on the diffusion of water. These microstructural models infer parameters such as the distribution of axon diameters within a voxel (Assaf et al. 2008), or other microstructural parameters (Ferizi et al. 2017).
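To make the generative character of these models concrete, here is a minimal numpy sketch of an ordinary least-squares fit of the diffusion tensor model (Basser et al. 1994) for a single voxel, together with the corresponding signal prediction. The function names and the simulated acquisition scheme are ours, for illustration only; b-values are assumed to be in s/mm² and diffusivities in mm²/s.

```python
import numpy as np

def fit_tensor_ols(bvals, bvecs, signal):
    """Ordinary least-squares fit of the diffusion tensor model to the
    log-transformed signal: ln S = ln S0 - b * g^T D g.
    bvals: (n,) b-values; bvecs: (n, 3) gradient directions;
    signal: (n,) dMRI measurements from one voxel."""
    gx, gy, gz = bvecs.T
    b = bvals
    # Design matrix for the parameters [Dxx, Dyy, Dzz, Dxy, Dxz, Dyz, ln(S0)]
    X = np.column_stack([-b * gx**2, -b * gy**2, -b * gz**2,
                         -2 * b * gx * gy, -2 * b * gx * gz,
                         -2 * b * gy * gz, np.ones_like(b)])
    coef, *_ = np.linalg.lstsq(X, np.log(signal), rcond=None)
    dxx, dyy, dzz, dxy, dxz, dyz, log_s0 = coef
    D = np.array([[dxx, dxy, dxz], [dxy, dyy, dyz], [dxz, dyz, dzz]])
    return D, np.exp(log_s0)

def predict_tensor(bvals, bvecs, D, s0):
    """Generative prediction: S = S0 * exp(-b * g^T D g)."""
    bDg = np.einsum('ni,ij,nj->n', bvecs, D, bvecs)
    return s0 * np.exp(-bvals * bDg)
```

Because the model is generative, `predict_tensor` can be evaluated on measurement directions that were not used for fitting, which is exactly the property the cross-validation procedure below exploits.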

Different models may be more appropriate for different datasets. To compare models within a dataset, we advocate a statistical learning approach, relying primarily on a cross-validation procedure: the model is fit to one part of the data, and the model parameters are used to predict the signal in another part of the data. Using cross-validation, two criteria can be used to evaluate models. The first is model parameter reliability: the degree to which the parameters of a model vary across repeated measurements, which assesses whether the inferences drawn from the model are robust to variations due to noise. The second is model signal accuracy: the degree to which the model prediction matches the measured signal. A useful benchmark for prediction accuracy is test-retest reliability. We introduce the relative RMSE (rRMSE) metric (Rokem et al. 2015): the ratio between the cross-validated prediction error and the test-retest error. If a model prediction is more accurate than test-retest reliability, the model captures systematic variability in the signal and may be useful; in this case, rRMSE is smaller than 1. If a model prediction is indistinguishable from the true (noiseless) signal, rRMSE converges to 1/sqrt(2), because the prediction incurs only one measurement's noise, while the test-retest comparison incurs the noise of two measurements. This gives model evaluations in different datasets a natural scale for comparison.
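The metric itself is simple to express. Below is a minimal numpy sketch of rRMSE in the spirit of Rokem et al. (2015); the function and variable names are ours. An ideal model that recovers the noiseless signal approaches rRMSE = 1/sqrt(2), while a "model" that simply reproduces its own training measurement scores exactly 1.

```python
import numpy as np

def rmse(a, b):
    """Root mean squared error between two signal vectors."""
    return np.sqrt(np.mean((a - b) ** 2))

def rrmse(pred_from_1, pred_from_2, data_1, data_2):
    """Relative RMSE: cross-validated prediction error divided by the
    test-retest error between two repeated measurements.
    pred_from_1 is the model prediction derived from data_1, evaluated
    against the held-out data_2 (and vice versa for pred_from_2)."""
    cv_error = 0.5 * (rmse(pred_from_1, data_2) + rmse(pred_from_2, data_1))
    return cv_error / rmse(data_1, data_2)
```

The averaging over the two cross-prediction directions makes the metric symmetric with respect to which measurement is treated as "test" and which as "retest".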

Rokem, A., Yeatman, J., Pestilli, F., Kay, K. N., Mezer, A., van der Walt, S., & Wandell, B. A. (2015). Evaluating the accuracy of diffusion MRI models in white matter. PLoS ONE, DOI: 10.1371/journal.pone.0123272.

Jensen, J. H., Helpern, J. A., Ramani, A., Lu, H., & Kaczynski, K. (2005). Diffusional kurtosis imaging: The quantification of non-Gaussian water diffusion by means of magnetic resonance imaging. Magnetic Resonance in Medicine, 53(6), 1432-1440.

Le Bihan, D., Breton, E., Lallemand, D., Aubin, M. L., Vignaud, J., & Laval-Jeantet, M. (1988). Separation of diffusion and perfusion in intravoxel incoherent motion MR imaging. Radiology, 168(2), 497-505.

Basser, P. J., Mattiello, J., & LeBihan, D. (1994). MR diffusion tensor spectroscopy and imaging. Biophysical Journal, 66(1), 259-267.

Ferizi, U., Scherrer, B., Schneider, T., Alipoor, M., Eufracio, O., Fick, R. H., ... & Poot, D. H. (2017). Diffusion MRI microstructure models with in vivo human brain Connectome data: results from a multi-group comparison. NMR in Biomedicine, 30(9).

Pasternak, O., Sochen, N., Gur, Y., Intrator, N., & Assaf, Y. (2009). Free water elimination and mapping from diffusion MRI. Magnetic Resonance in Medicine, 62(3), 717-730.

Hoy, A. R., Koay, C. G., Kecskemeti, S. R., & Alexander, A. L. (2014). Optimization of a free water elimination two-compartment model for diffusion tensor imaging. NeuroImage, 103, 323-333.

Assaf, Y., Blumenfeld-Katzir, T., Yovel, Y., & Basser, P. J. (2008). AxCaliber: a method for measuring axon diameter distribution from diffusion MRI. Magnetic Resonance in Medicine, 59(6), 1347-1354.

A cross-validation procedure relies on fitting the model to one portion of the data and testing the model fit on another portion. We benchmark the model's prediction errors against the test-retest errors; the ratio between these two error terms is the rRMSE (relative root mean squared error).
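The full loop, fit to one scan, predict the other, benchmark against test-retest, can be sketched end-to-end with a deliberately simple stand-in model (an isotropic mono-exponential decay, chosen only so the example is self-contained; a real evaluation would use the models discussed above). All names here are ours.

```python
import numpy as np

def fit_adc(bvals, signal):
    """Log-linear fit of an isotropic decay S = S0 * exp(-b * ADC).
    A stand-in model, used here only to illustrate the procedure."""
    slope, intercept = np.polyfit(bvals, np.log(signal), 1)
    return -slope, np.exp(intercept)  # ADC, S0

def predict_adc(bvals, adc, s0):
    """Generative prediction from the fitted parameters."""
    return s0 * np.exp(-bvals * adc)

def split_half_rrmse(bvals, scan_1, scan_2):
    """Fit to each repeated scan, predict the other scan, and benchmark
    the prediction error against the test-retest error."""
    def rmse(err):
        return np.sqrt(np.mean(err ** 2))
    adc_1, s0_1 = fit_adc(bvals, scan_1)
    adc_2, s0_2 = fit_adc(bvals, scan_2)
    err_12 = rmse(predict_adc(bvals, adc_1, s0_1) - scan_2)
    err_21 = rmse(predict_adc(bvals, adc_2, s0_2) - scan_1)
    return 0.5 * (err_12 + err_21) / rmse(scan_1 - scan_2)
```

For a model that captures the systematic structure of the signal, this ratio falls below 1 and, with enough data, approaches 1/sqrt(2).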