The age of big data has renewed interest in dimension reduction. Coping with high-dimensional data remains a difficult problem in statistical learning. In this study, we consider the task of dimension reduction: projecting data onto a lower-rank subspace while preserving maximal information. We investigate the pitfalls of classical PCA and propose a set of algorithms that function in high dimensions, extend to all exponential family distributions, perform feature selection at the same time, and take missing values into consideration. Based on the best-performing one, we develop the SG-PCA algorithm. With acceleration techniques and a progressive screening scheme, it demonstrates superior scalability and accuracy compared to existing methods. Concerned with the independence assumption made by dimension reduction techniques, we propose a novel framework, Generalized Indirect Dependency Learning (GIDL), to learn and incorporate association structure in multivariate statistical analysis. Without constraints on the particular distribution of the data, GIDL takes any pre-specified smooth loss function and is able both to extract the association structure and to infuse it into the regression, classification, or dimension reduction problem. Experiments at the end demonstrate its efficacy.
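For readers unfamiliar with the baseline the abstract starts from, the following is a minimal sketch of classical PCA as a rank-constrained projection, computed with a singular value decomposition. It is not the SG-PCA or GIDL method of the dissertation. The function name `pca_project`, the synthetic data, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np

def pca_project(X, rank):
    """Classical PCA via SVD (illustrative baseline, not SG-PCA).

    Returns the low-dimensional scores, the orthonormal loading vectors,
    the column means, and the fraction of variance retained.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                                  # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:rank]                         # shape (rank, p)
    scores = Xc @ components.T                     # shape (n, rank)
    explained = (s[:rank] ** 2).sum() / (s ** 2).sum()
    return scores, components, mean, explained

# Hypothetical synthetic data: a rank-2 signal in 10 dimensions plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 10))
X = latent @ loadings + 0.01 * rng.normal(size=(200, 10))

scores, components, mean, explained = pca_project(X, rank=2)
X_hat = scores @ components + mean                 # rank-2 reconstruction
```

Here `scores` holds the data expressed in the two-dimensional subspace, and `X_hat` is the corresponding low-rank reconstruction in the original ten dimensions. This plain version assumes fully observed, roughly Gaussian data and performs no feature selection, which are the limitations the abstract says the proposed algorithms address.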
A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy.
Includes bibliographical references.
Yiyuan She, Professor Directing Dissertation; Teng Ma, University Representative; Xufeng Niu, Committee Member; Debajyoti Sinha, Committee Member; Elizabeth Slate, Committee Member.
Florida State University
Use and Reproduction
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). The copyright in theses and dissertations completed at Florida State University is held by the students who author them.