This project will develop a reproducible data mining analysis pipeline for biomedical prediction modeling. The elements of the pipeline to be developed over the course of the project will include basic (1) data cleaning, (2) data transformation or feature construction, (3) feature selection, (4) machine learning modeling, (5) statistical analysis of results, and (6) interpretation of the results by characterizing patterns of association through model visualization. The project will emphasize more advanced machine learning approaches that can detect complex patterns of association between independent variables and the disease outcome of interest. In particular, it will use random forests, basic convolutional neural networks, and learning classifier systems as the machine learning approaches.
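As a rough illustration of how the pipeline stages might fit together, the following minimal sketch chains cleaning, transformation, feature selection, and a random forest using scikit-learn. Everything specific here is an assumption made for illustration and not part of the project description: the choice of scikit-learn, the synthetic dataset standing in for biomedical data, median imputation, the ANOVA F-test selector, the number of selected features, and balanced accuracy as the metric. Convolutional neural networks and learning classifier systems are not shown, and the statistical analysis stage is reduced to a cross-validation summary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

SEED = 0  # fixed seed so that reruns reproduce the same numbers

# Hypothetical stand-in for a biomedical dataset: 300 samples, 20 features,
# binary outcome. Real data would be loaded here instead.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=SEED
)

# Blank out about 5% of values so the cleaning step has missing data to handle.
rng = np.random.default_rng(SEED)
X[rng.random(X.shape) < 0.05] = np.nan

pipeline = Pipeline([
    ("clean", SimpleImputer(strategy="median")),            # (1) data cleaning
    ("transform", StandardScaler()),                        # (2) transformation
    ("select", SelectKBest(score_func=f_classif, k=10)),    # (3) feature selection
    ("model", RandomForestClassifier(n_estimators=200,      # (4) modeling
                                     random_state=SEED)),
])

# (5) Evaluation: stratified 5-fold cross-validation. Every step is refit
# inside each fold, so feature selection never sees the held-out samples.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=SEED)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="balanced_accuracy")
print(f"balanced accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

# (6) Interpretation: refit on all data and rank the selected features by
# random forest importance (indices refer to columns of the original X).
pipeline.fit(X, y)
selected = pipeline.named_steps["select"].get_support(indices=True)
importances = pipeline.named_steps["model"].feature_importances_
ranked = sorted(zip(selected, importances), key=lambda p: p[1], reverse=True)
for idx, imp in ranked[:5]:
    print(f"feature {idx}: importance {imp:.3f}")
```

Wrapping the steps in a single `Pipeline` object and fixing the random seed is one common way to keep such an analysis reproducible, since the same sequence of fitted steps is applied identically to training and held-out data.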