Some of the material in is restricted to members of the community. By logging in, you may be able to gain additional access to certain collections or items. If you have questions about access or logging in, please use the form on the Contact Page.
Statisticians often encounter data in the form of a combination of discrete and continuous outcomes. A special case is zero-inflated longitudinal data where the response variable has a large portion of zeros. These data exhibit...
Evaluating the performance of models predicting a binary outcome can be done using a variety of measures. While some measures intend to describe the model's overall fit, others more accurately describe the model's ability to discriminate...
In the classical statistics literature, a large number of methods have been developed for data analysis on Euclidean space. Over the past few decades, however, growing interest has been devoted to non-Euclidean data analysis. In...
Polychotomous quantal response models are widely used in medical and econometric studies to analyze categorical or ordinal data. In this study, we apply the Bayesian methodology through a mixed-effects polychotomous quantal response...
Image analysis often requires dimension reduction before statistical analysis, in order to apply sophisticated procedures. Motivated by eventual applications, a variety of criteria have been proposed: reconstruction error, class...
Two main challenges in computational biology are identifying differentially expressed genes from gene expression data and discovering interactions among biological variables from genomics data. This dissertation presents two studies in each of them. In...
Recurrent events data are rising in all areas of biomedical research. We present a model for recurrent events data with the same link for the intensity and mean functions. Simple interpretations of the covariate effects on both the...
With rapid advances in data acquisition and storage techniques, modern scientific investigations in epidemiology, genomics, imaging and networks are increasingly producing challenging data structures in the form of high-dimensional...
Statistical depth, a commonly used analytic tool in non-parametric statistics, has been extensively studied for multivariate and functional observations over the past few decades. Although various forms of depth were introduced, they are...
Motivated by understanding the devastating financial crisis in 2008 that was partially caused by underestimation of financial risk, we propose a class of time-varying mixture models for risk analysis and management. There are various...
Multivariate response models are increasingly used in almost all fields, with the necessary employment of inferential methods such as Canonical Correlation Analysis (CCA). This requires the estimation of the number of...
We develop a modeling framework to simultaneously evaluate various types of predictability in stock returns, including stocks' sensitivity ("betas") to systematic risk factors, stocks' abnormal returns unexplained by risk factors (...
Our view is that while some of the basic principles of data analysis are going to remain unchanged, others are to be gradually replaced with Geometry and Topology methods. Linear methods still make sense for functional data...
Convolutional Neural Networks (CNNs) are widely used and have an impressive performance in detecting and classifying objects. However, the CNN's performance is sensitive to variations in rotation, position or scaling of the objects to be...
This research work is an attempt to illustrate the versatility and wide applications of the field of statistical science. Specifically, the research work involves the application of statistics in the field of law. The application will...
With the increasing popularity of information technology, especially electronic imaging techniques, large amounts of high-dimensional data such as 3D shapes have become pervasive in science, engineering and even people's daily life, in the...
The high mortality rate and huge expenditure caused by dementia make it a pressing concern for public health researchers. Among the potential risk factors in diet and nutrition, the relation between alcohol usage and dementia has been...
The Barker Hypothesis states that maternal and in utero attributes during pregnancy affect a child's cardiovascular health throughout life. We present an analysis of a unique longitudinal dataset from Jamaica that consists of three...
Recent advances in computing and measurement technologies have led to an explosion in the amount of data that are being collected in many areas of application. Much of these data have network or graph structures, and they are common in...
In this study, we will examine the Bayesian Dynamic Survival Models, time-varying coefficients models from a Bayesian perspective, and their applications in the aging setting. The specific questions we are interested in are: Do the...
Longitudinal studies are widely used in various fields, such as public health, clinical trials and financial data analysis. A major challenge for longitudinal studies is repeated measurements from each subject, which cause time dependent...
Shape analysis of curves and surfaces is a very important tool in many applications ranging from computer vision to bioinformatics and medical imaging. There are many difficulties when analyzing shapes of parameterized curves and...
A major interest of survival analysis is to assess covariate effects on survival via appropriate conditional hazard function regression models. The Cox proportional hazards model, which assumes an exponential form for the relative risk, ...
The technological advances in recent years have produced a wealth of intricate digital imaging data that is analyzed effectively using the principles of shape analysis. Such data often lies on either high-dimensional or infinite...
This thesis consists of two distinct topics. First, we present a framework for estimation and analysis of trajectories on Riemannian manifolds. Second, we propose a framework for detecting, classifying, and estimating shapes in point...
Utilizing high throughput gene expression data stored in public archives not only saves research time and cost but also enhances the power of its statistical support. However, gene expression profiling data can be obtained from many...
Methods employed in the construction of prediction bands for continuous curves require a different approach to those used for a data point. In many cases, the underlying function is unknown and thus a distribution-free approach which...
In the first project, we propose to generalize the notion of depth in temporal point process observations. The new depth is defined as a weighted product of two probability terms: 1) the number of events in each process, and 2) the...
Forecasting a univariate target time series in high dimensions with very many predictors poses challenges in statistical learning and modeling. First, many nuisance time series exist and need to be removed. Second, from economic theories...
This dissertation is on analysis of invariants of a 3D configuration from its 2D images in pictures of this configuration, without requiring any restriction on the camera positioning relative to the scene pictured. We briefly review some...
Statistical depth functions have been well studied for multivariate data and functional data but remained under-explored for point processes until very recently, when Liu and Wu made a first attempt. Generally, neither depth functions for...
In this thesis we investigate post-model selection properties of L1 penalized weighted least squares estimators in regression models with a large number of variables M and correlated errors. We focus on correct subset selection and on...
In this thesis, based on an orthonormal series expansion, we propose a new nonparametric method to estimate copula density functions. Since the basis coefficients turn out to be expectations, empirical averages are used to estimate these...
We study group variable selection in multivariate regression models. Group variable selection is equivalent to selecting the non-zero rows of the coefficient matrix, since there are multiple response variables and thus if one predictor is...
We perform a quasi-3D Bayesian inversion of oceanographic tracer data from the South Atlantic Ocean. Initially we are considering one active neutral density layer with an upper and lower boundary. The available hydrographic data is...
This dissertation studies statistical shape analysis of planar objects. The focus is on two different representations. The first one considers only the boundary of planar shapes, a comprehensive analysis framework including...
The generalized linear model and particularly the logistic model are widely used in public health, medicine, and epidemiology. Goodness-of-fit tests for these models are popularly used to describe how well a proposed model fits a set of...
This dissertation introduces and assesses an algorithm to generate confidence bands for a regression function or a main effect when multiple data sets are available. In particular it proposes to construct confidence bands for different...
In this dissertation, we focus on the problem of analyzing high-dimensional functional data using geometric approaches. The term functional data refers to images, densities and trajectories on manifolds. The nature of these data imposes...
The envelope model is a nascent dimension reduction technique. We focus on extending the envelope methodology to broader applications. In the first part of this thesis we propose a common reducing subspace model that can simultaneously...
In this essay we present an analysis examining the basic dietary structure and its relationship to mortality in the first National Health and Nutrition Examination Survey (NHANES I) conducted between 1971 and 1975. We used results from 24...
Identifying influential observations in the data is desired to ensure proper inference and statistical analysis. Modern methods for identifying influential cases use cross-validation diagnostics based on the effect of deletion of the i-th...
This dissertation develops novel single random effect models as well as bivariate correlated random effects model for clustered data with bivariate mixed responses. Logit and identity link functions are used for the binary and continuous...
In this thesis we investigate statistical modelling of neural activity in the brain. We first develop a framework which is an extension of the state-space Generalized Linear Model (GLM) by Eden and colleagues [20] to include the effects...
In this dissertation, we examine binary correlated data with present/absent component or missing data that are related to binary responses of interest. Depending on the data structure, correlated binary data can be referred to as...
For many biomedical, environmental and economic studies with an unknown non-linear relationship between the response and its multiple predictors, a single index model provides practical dimension reduction and good physical...
Background: Genomic and epigenomic data analysis has been a popular research area in the 21st century. Common research problems include detecting differentially expressed genes between groups, clustering and classification using genomic...
We examine the impact of missing data in two settings, the development of prognostic models and the addition of new risk factors to existing risk functions. Most statistical software presently available performs complete case analysis, ...
Determinantal point processes (DPPs), which can be defined by their correlation kernels with known moments, are useful models for point patterns where nearby points exhibit repulsion. They have many nice properties, such as closed-form...