Work is presented from two projects, each involving an application of machine learning to precision medicine. The first project was for the Document Triage Task of the BioCreative VI Precision Medicine Track. Teams were asked to build machine learning models to identify journal abstracts that contain at least one mention of a protein-protein interaction (PPI) affected by a mutation. The second project analyzes gene expression data from a group of breast cancer patients receiving neoadjuvant chemotherapy, searching for biomarkers that predict the outcome of treatment. The model developed for the BioCreative challenge did not use state-of-the-art methods but achieved results only slightly worse than those of modern deep learning techniques. My contribution to this project was in feature engineering, model tuning, and model validation. The feature engineering process will be presented along with a discussion of the difficulties caused by the scarcity of data. The data for the second project were collected from breast cancer patients at the Sun Yat-sen University Cancer Center in Guangzhou, China. RNA-Seq data and clinical information were collected from patients before and after neoadjuvant chemotherapy. Genes and pathways of potential relevance to the outcome of neoadjuvant therapy were identified for further study.