Yu, K. (2016). Statistical Methods for Big Data and Their Applications in Biomedical Research. Retrieved from http://purl.flvc.org/fsu/fd/FSU_2016SP_Yu_fsu_0071E_13079
Big data has brought both opportunities and challenges to our research community. Complex models can now be built from volumes of data that researchers never had access to before. In this study, we explore structure learning of Bayesian networks (BNs) and its application to the reverse engineering of gene regulatory networks (GRNs). A Bayesian network is a graphical representation of a joint distribution that encodes the conditional dependencies and independencies among the variables. We propose a novel three-stage BN structure learning method called GRASP (GRowth-based Approach with Staged Pruning). In the first stage, we design a new method for skeleton (undirected edge) discovery, double filtering (DF). Compared to existing methods, DF requires smaller sample sizes to achieve similar statistical power. In the second stage, based on the skeleton estimated in the first stage, we propose a sequential Monte Carlo (SMC) method that samples the edges and their directions to optimize a BIC-based score. The SMC method is less prone to being trapped in local optima, and its computation is easily parallelizable. In the third stage, we reclaim edges that may have been missed in the previous stages. We obtained satisfactory results in a simulation study and applied the method to infer GRNs from real experimental data. A method for personalized chemotherapy regimen selection for breast cancer and a novel algorithm for relationship extraction from unstructured documents are also discussed.
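As a rough illustration of what a BIC-based score for a candidate network structure can look like, the sketch below computes the standard textbook BIC of a discrete Bayesian network: the maximized log-likelihood minus a penalty of (log N)/2 per free parameter. This is not the GRASP scoring function, whose details the abstract does not give. The function name `bic_score`, the toy data, and the choice to count only observed states when sizing the parameter penalty are all assumptions made for the example.

```python
import math
from collections import Counter


def bic_score(data, parents):
    """Textbook BIC of a discrete Bayesian network (hypothetical helper).

    data:    list of dicts mapping variable name -> observed discrete value
    parents: dict mapping each variable name -> tuple of its parent names
    """
    n = len(data)
    # Assumption: the number of states of each variable equals the number
    # of distinct values seen in the data.
    card = {v: len({row[v] for row in data}) for v in parents}
    score = 0.0
    for child, pa in parents.items():
        # Counts of (parent configuration, child state) and of parent configuration alone.
        joint = Counter((tuple(row[p] for p in pa), row[child]) for row in data)
        marg = Counter(tuple(row[p] for p in pa) for row in data)
        # Maximized log-likelihood contribution of this node.
        loglik = sum(c * math.log(c / marg[cfg]) for (cfg, _), c in joint.items())
        # Free parameters: (states - 1) per parent configuration.
        n_params = (card[child] - 1) * math.prod(card[p] for p in pa)
        score += loglik - 0.5 * math.log(n) * n_params
    return score


# Toy data in which B always copies A.
data = [{"A": 0, "B": 0}] * 4 + [{"A": 1, "B": 1}] * 4
empty = {"A": (), "B": ()}      # no edges
chain = {"A": (), "B": ("A",)}  # A -> B

print(bic_score(data, empty), bic_score(data, chain))
```

On this toy data the structure with the edge A -> B scores higher than the empty graph, because the gain in likelihood outweighs the penalty for the extra parameter. A structure search such as the one described above would compare many candidate graphs by a score of this kind.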
Bayesian network structure learning, Neoadjuvant chemotherapy, Protein-protein interaction, sequential Monte Carlo
Date of Defense
March 22, 2016.
Submitted Note
A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy.
Bibliography Note
Includes bibliographical references.
Advisory Committee
Jinfeng Zhang, Professor Directing Dissertation; Qing-Xiang Amy Sang, University Representative; Adrian Barbu, Committee Member; Yiyuan She, Committee Member; Debajyoti Sinha, Committee Member.
Publisher
Florida State University
Identifier
FSU_2016SP_Yu_fsu_0071E_13079
Use and Reproduction
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). The copyright in theses and dissertations completed at Florida State University is held by the students who author them.