Bharath Ambale Venkatesh^{1}

Machine learning methods are better suited than traditional methods or risk scores for meaningful risk prediction in extensively phenotyped, large-scale epidemiological studies. This strategy could show which variables matter for predicting specific events and could guide strategies to prevent cardiovascular disease outcomes. These techniques could also be applied retrospectively to large data sets to identify disease mechanisms and to generate hypotheses without prior assumptions.

Few studies have used machine learning techniques with deep phenotyping (multiple evaluations of different aspects of a specific disease process) for cardiovascular event prediction. This course will examine how well deep phenotyping combined with machine learning predicts cardiovascular events, using data from MESA (Multi-Ethnic Study of Atherosclerosis) as an example. It will introduce a random survival forests-based method of risk prediction as a means of achieving more accurate prediction than established risk scores.
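One way to make "more accurate than established risk scores" concrete is a rank-based metric for censored data such as Harrell's concordance index. The abstract does not state which metric the course uses, so the choice of metric, the function name `concordance_index`, and the toy follow-up data below are all assumptions for illustration, not MESA data or the course's own code. The sketch counts, among comparable pairs of participants, how often the one with the earlier observed event was assigned the higher risk.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk score orders correctly.

    times       -- follow-up time for each participant
    events      -- 1 if the event was observed, 0 if the participant was censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (assumption): five participants, two censored.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
score_a = [0.9, 0.7, 0.5, 0.4, 0.1]  # ranks every comparable pair correctly
score_b = [0.2, 0.8, 0.5, 0.9, 0.1]  # ranks half of the comparable pairs correctly

c_a = concordance_index(times, events, score_a)
c_b = concordance_index(times, events, score_b)
```

A value of 1.0 means perfect ranking and 0.5 is what random ordering would give on average, so two risk models can be compared on the same cohort by comparing their C values.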

This course will also introduce an overall paradigm for the use of deep learning for automated image analysis in conjunction with statistical machine learning for event prediction and risk stratification.

The course will explain the output of the algorithms and how to interpret it. It will introduce variable selection techniques for determining the value of including subclinical disease markers obtained from imaging, electrocardiography, and blood tests. It will also introduce dimension-reduction techniques for the purpose of event prediction.
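As a rough illustration of dimension reduction ahead of event prediction, the sketch below uses principal component analysis via a singular value decomposition. PCA is swapped in here as a generic, widely used technique; the abstract does not say which dimension-reduction methods the course covers. The helper name `pca_scores` and the synthetic data are assumptions, not MESA measurements.

```python
import numpy as np


def pca_scores(X, n_components):
    """Project rows of X onto the leading principal components.

    Returns (scores, components, explained_ratio), where explained_ratio is the
    fraction of total variance captured by each retained component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each variable
    # Rows of Vt are orthonormal principal directions, ordered by variance.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return scores, components, explained_ratio[:n_components]


# Synthetic example (assumption): 10 correlated markers driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

scores, components, explained = pca_scores(X, n_components=2)
```

The two columns of `scores` could then stand in for the ten original markers as inputs to a survival model, which is the general idea of reducing many correlated measurements to a few summary variables.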

I thank Joao A Lima, David A Bluemke, Colin O Wu, and Xiaoying Yang for their help with the work on cardiovascular event prediction. I would also like to thank Alistair Young and Pau Medrano-Gracia for their collaboration on the work on dimension-reduction techniques.

1. Ambale-Venkatesh, Bharath, et al. "Cardiovascular event prediction by machine learning: the Multi-Ethnic Study of Atherosclerosis." Circulation research (2017): CIRCRESAHA-117.

2. Ishwaran, H., Kogalur, U. B., Blackstone, E. H., & Lauer, M. S. (2008). Random survival forests. The annals of applied statistics, 841-860.

3. Tibshirani, R. (1997). The lasso method for variable selection in the Cox model. Statistics in medicine, 16(4), 385-395.

4. Medrano-Gracia, P., Cowan, B. R., Ambale-Venkatesh, B., Bluemke, D. A., Eng, J., Finn, J. P., ... & Young, A. A. (2014). Left ventricular shape variation in asymptomatic populations: the multi-ethnic study of atherosclerosis. Journal of Cardiovascular Magnetic Resonance, 16(1), 56.

5. Zhang, X., Ambale-Venkatesh, B., Bluemke, D. A., Cowan, B. R., Finn, J. P., Kadish, A. H., ... & Young, A. A. (2015). Information maximizing component analysis of left ventricular remodeling due to myocardial infarction. Journal of translational medicine, 13(1), 343.