Brandon Campbell^{1,2}, Gregory Simchick^{1,2}, Hang Yin^{3}, and Qun Zhao^{1,2}

Previous classification techniques for quantifying white adipose tissue (WAT) and brown adipose tissue (BAT) have relied on fat fraction and proton relaxation times derived from fixed-peak spectroscopic models. Machine learning algorithms have proven highly accurate for image segmentation, but their accuracy depends heavily on the input dataset. The recently proposed Multi-Varying-Peak MR Spectroscopy model increases voxel-wise dataset specificity by including varying fat peak intensity values as features. Using this new dataset, four machine learning models were compared.

**Introduction**

**Data acquisition and Methods**

MRI data were acquired from the interscapular BAT
(iBAT) and inguinal WAT (igWAT) regions of six healthy C57BL/6 mice using a 7T
Agilent small animal imaging scanner. A 2D multi-echo gradient-echo sequence
was used, consisting of 12 echoes with an echo spacing of 0.525 ms, an initial echo time of
5.7 ms, a FOV of 25x25 mm, and a 256x256 matrix. For supervised learning, labels were applied to the data
based on the known locations of BAT, WAT, and muscle in the scanned regions. An Echo dataset was created from the complex signals
of the 12 acquired echoes, with the real and imaginary components of each echo
arranged in alternating order. A Peak dataset was also produced, consisting of five spectroscopic peak amplitudes derived from the
previously proposed Multi-Varying-Peak MR Spectroscopy (MVP-MRS) method plus
fat fraction and R2*.^{4} Previous
multi-peak methods assume a fixed amplitude for each spectroscopic fat
peak; in contrast, the MVP-MRS model allows these spectroscopic peak
amplitudes to vary and therefore to be included in a feature set for classification.^{5}
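As an illustration, the construction of the two per-voxel feature vectors might be sketched as follows; the array shapes follow the description above (12 complex echoes, five MVP-MRS peaks plus fat fraction and R2*), but the values and variable names are hypothetical placeholders, not the acquired data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical complex multi-echo signal: 12 echoes per voxel
# (values are illustrative, not the acquired data).
n_voxels, n_echoes = 4, 12
echoes = (rng.standard_normal((n_voxels, n_echoes))
          + 1j * rng.standard_normal((n_voxels, n_echoes)))

# Echo dataset: alternate real and imaginary parts of each echo,
# yielding 24 features per voxel: [Re(e1), Im(e1), Re(e2), Im(e2), ...].
echo_features = np.empty((n_voxels, 2 * n_echoes))
echo_features[:, 0::2] = echoes.real
echo_features[:, 1::2] = echoes.imag

# Peak dataset: five MVP-MRS peak amplitudes plus fat fraction and R2*,
# i.e. 7 features per voxel (placeholder values).
peaks = rng.standard_normal((n_voxels, 5))
fat_fraction = rng.uniform(0.0, 1.0, (n_voxels, 1))
r2_star = rng.uniform(0.0, 200.0, (n_voxels, 1))
peak_features = np.hstack([peaks, fat_fraction, r2_star])
```

This gives 24 features per voxel for the Echo dataset and 7 for the Peak dataset.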

Four
different supervised machine learning methods were applied to each dataset to compare
their accuracies: Support Vector Machine (SVM), Fully-Connected Neural Network
(FCNN), One-dimensional Convolutional Neural Network (1D-CNN), and a Two-dimensional Convolutional Neural Network (2D-CNN). **Figure 1** shows the architectures of the models. The inputs for each
of the methods except the 2D-CNN were flattened arrays of per-voxel data from each
dataset. The inputs for the 2D-CNN were 9x9-voxel patches extracted from the data;
each patch was assigned the label of its center voxel, allowing segmentation
predictions to be made for that voxel.
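A minimal sketch of the 2D-CNN input construction described above, assuming a single-channel image and a label map of the same size; how the original work handled boundary voxels is not specified, so this sketch simply skips voxels without a full neighborhood:

```python
import numpy as np

def extract_patches(image, labels, patch=9):
    """Extract patch-x-patch neighborhoods from an image; each patch
    takes the label of its center voxel. Boundary voxels without a
    full neighborhood are skipped (an assumption for this sketch)."""
    half = patch // 2
    H, W = labels.shape
    X, y = [], []
    for r in range(half, H - half):
        for c in range(half, W - half):
            X.append(image[r - half:r + half + 1, c - half:c + half + 1])
            y.append(labels[r, c])
    return np.stack(X), np.asarray(y)
```

For a 256x256 slice this yields 248x248 labeled 9x9 patches, each used to predict the class of its center voxel.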

**Results and Discussions**

To determine the
accuracy of each model, ten separate runs were performed per model, and the
mean and standard deviation of the training and validation accuracies were
calculated, **Figure 2**.
Confusion matrices and ROC curves were generated from each model's
validation results, **Figure 3**.
Across both datasets and all models, the lowest accuracies were obtained by the SVM
on the Peak dataset, with 95.09% for training and 94.93% for
validation. This is an
expected result, as the SVM is a linear classifier with the fewest
parameters determining its predictions. The best-performing model is the
2D-CNN with the Echo dataset (99% training and 98% validation), a small improvement over
the Peak dataset (97% and 96%). The spatial information incorporated
by the 2D-CNN is likely the reason for its superior performance.

The
confusion matrices and ROC curves in **Figure 3** provide further insight into the classification predictions of each model.
The major source of error is the difficulty of
separating WAT from BAT. The ROC curves show that, in each case, at low
confidence thresholds, false negatives for BAT present the biggest problem. This
can be seen in the red curve of each ROC plot, whose area under the curve tends
to be slightly lower than that of the other classes.
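The per-class AUC comparison above can be reproduced with a small one-vs-rest calculation. The rank-statistic form below is a generic sketch (not the authors' evaluation code), using the fact that AUC equals the probability that a randomly chosen positive voxel outscores a randomly chosen negative one:

```python
import numpy as np

def one_vs_rest_auc(scores, is_positive):
    """ROC AUC for one class from per-voxel confidence scores.
    AUC = P(score of a positive > score of a negative), with ties
    counted as half a win."""
    pos = scores[is_positive]
    neg = scores[~is_positive]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (pos.size * neg.size)

# Illustrative confidence scores for a "BAT" class:
# perfectly ranked positives give AUC = 1.0.
scores = np.array([0.9, 0.8, 0.3, 0.1])
labels = np.array([True, True, False, False])
print(one_vs_rest_auc(scores, labels))  # 1.0
```

A per-class AUC is obtained by treating one tissue class (e.g. BAT) as positive and the other two as negative, which is how a lower BAT curve would show up numerically.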

**Conclusion**

**References**

1. Cypess, A. M. et al. Identification and Importance of Brown Adipose Tissue in Adult Humans. N. Engl. J. Med. 360, 1509–1517 (2009).

2. Bhanu Prakash, K. N. et al. Segmentation and characterization of interscapular brown adipose tissue in rats by multi-parametric magnetic resonance imaging. Magn. Reson. Mater. Physics, Biol. Med. 29, 277–286 (2016).

3. Bhanu Prakash, K. N., Srour, H., Velan, S. S. & Chuang, K.-H. A method for the automatic segmentation of brown adipose tissue. Magn. Reson. Mater. Physics, Biol. Med. 29, 287–299 (2016).

4. Simchick, G., Wu, J., Shi, G., Yin, H. & Zhao, Q. Characterization of Brown Adipose Tissue using Multi-Varying-Peak MR Spectroscopy (MVP-MRS). in Proceedings of the Annual Conference of International Society of Magnetic Resonance in Medicine 3280 (2016).

5. Zancanaro, C. et al. Magnetic resonance spectroscopy investigations of brown adipose tissue and isolated brown adipocytes. J. Lipid Res. 35, 2191–2199 (1994).

Figure 1: A.) FCNN architecture comprising three layers. The first layer contains 60 nodes, the second layer contains 30 nodes, and the final output layer contains 3 nodes corresponding to BAT, WAT, and muscle. B.) Five-layer 1D-CNN architecture with alternating convolution and max pooling layers leading to a final fully connected output. C.) Five-layer 2D-CNN architecture with alternating convolution and max pooling layers leading to a final fully connected output. The difference between the 1D- and 2D-CNNs is the dimensionality of the inputs. The 2D-CNN inputs are 3x3 voxel patches totaling the same number of inputs as the 1D-CNN.

Figure 2: Model accuracies and standard deviations for each of the machine
learning models and each dataset. Training accuracies describe performance on
the data used during the training process, upon which SGD is used to adjust
the model weights.

Figure 3: Confusion matrices and ROC curves for each of the models and datasets.
The confusion matrices show how well each model makes predictions on each
class. The rows of the matrices are predicted labels from the machine learning
algorithm, while columns are the true labels of the data. ROC curves illustrate
how well a model makes predictions on each class at different confidence
thresholds. The blue curve represents WAT, the red curve is BAT and the green
curve is muscle. The area under each curve is a measure of how accurate the
model is for a specific class.