Search results
 Title
 2D Affine and Projective Shape Analysis, and Bayesian Elastic Active Contours.
 Creator

Bryner, Darshan W., Srivastava, Anuj, Klassen, Eric, Gallivan, Kyle, Huffer, Fred, Wu, Wei, Zhang, Jinfeng, Department of Statistics, Florida State University
 Abstract/Description

An object of interest in an image can be characterized to some extent by the shape of its external boundary. Current techniques for shape analysis consider the notion of shape to be invariant to the similarity transformations (rotation, translation and scale), but often in 2D images of 3D scenes, perspective effects can transform shapes of objects in a more complicated manner than what can be modeled by the similarity transformations alone. Therefore, we develop a general Riemannian framework for shape analysis where metrics and related quantities are invariant to larger groups, the affine and projective groups, that approximate the transformations arising from perspective skews. Highlighting two possibilities for representing object boundaries, namely ordered points (or landmarks) and parametrized curves, we study different combinations of these representations (points and curves) and transformations (affine and projective). Specifically, we provide solutions to three out of four situations and develop algorithms for computing geodesics and intrinsic sample statistics, leading up to Gaussian-type statistical models, and classifying test shapes using such models learned from training data. In the case of parametrized curves, an added issue is to obtain invariance to the reparameterization group. The geodesics are constructed by particularizing the path-straightening algorithm to geometries of current manifolds and are used, in turn, to compute shape statistics and Gaussian-type shape models. We demonstrate these ideas using a number of examples from shape and activity recognition. After developing such Gaussian-type shape models, we present a variational framework for naturally incorporating these shape models as prior knowledge in guidance of active contours for boundary extraction in images.
This so-called Bayesian active contour framework is especially suitable for images where boundary estimation is difficult due to low contrast, low resolution, and the presence of noise and clutter. In traditional active contour models, curves are driven toward the minimum of an energy composed of image and smoothing terms. We introduce an additional shape term based on shape models of relevant, previously known shape classes. The minimization of this total energy, using iterated gradient-based updates of curves, leads to an improved segmentation of object boundaries. We demonstrate this Bayesian approach to segmentation using a number of shape classes in many imaging scenarios, including the synthetic imaging modalities of SAS (synthetic aperture sonar) and SAR (synthetic aperture radar), for which accurate boundary extractions are notoriously difficult to obtain. In practice, the training shapes used for prior-shape models may be collected from viewing angles different from those for the test images and thus may exhibit a shape variability brought about by perspective effects. Therefore, by allowing for a prior shape model to be invariant to, say, affine transformations of curves, we propose an active contour algorithm where the resulting segmentation is robust to perspective skews.
 Date Issued
 2013
 Identifier
 FSU_migr_etd8534
 Format
 Thesis
 Title
 Adaptive Series Estimators for Copula Densities.
 Creator

Gui, Wenhao, Wegkamp, Marten, Van Engelen, Robert A., Niu, Xufeng, Huffer, Fred, Department of Statistics, Florida State University
 Abstract/Description

In this thesis, based on an orthonormal series expansion, we propose a new nonparametric method to estimate copula density functions. Since the basis coefficients turn out to be expectations, empirical averages are used to estimate these coefficients. We propose estimators of the variance of the estimated basis coefficients and establish their consistency. We derive the asymptotic distribution of the estimated coefficients under mild conditions. We derive a simple oracle inequality for the copula density estimator based on a finite series using the estimated coefficients. We propose a stopping rule for selecting the number of coefficients used in the series and we prove that this rule minimizes the mean integrated squared error. In addition, we consider hard and soft thresholding techniques for sparse representations. We obtain oracle inequalities that hold with prescribed probability for various norms of the difference between the copula density and our threshold series density estimator. Uniform confidence bands are derived as well. The oracle inequalities clearly reveal that our estimator adapts to the unknown degree of sparsity of the series representation of the copula density. A simulation study indicates that our method is extremely easy to implement and works very well, and that it compares favorably to the popular kernel-based copula density estimator, especially around the boundary points, in terms of mean squared error. Finally, we have applied our method to an insurance dataset. After comparing our method with the previous data analyses, we reach the same conclusion as the parametric methods in the literature and as such we provide additional justification for the use of the developed parametric model.
 Date Issued
 2009
 Identifier
 FSU_migr_etd3929
 Format
 Thesis
 Title
 Algorithmic Lung Nodule Analysis in Chest Tomography Images: Lung Nodule Malignancy Likelihood Prediction and a Statistical Extension of the Level Set Image Segmentation Method.
 Creator

Hancock, Matthew C. (Matthew Charles), Magnan, Jeronimo Francisco, Duke, D. W., Hurdal, Monica K., Mio, Washington, Florida State University, College of Arts and Sciences, Department of Mathematics
 Abstract/Description

Lung cancer has the highest mortality rate of all cancers in both men and women in the United States. The algorithmic detection, characterization, and diagnosis of abnormalities found in chest CT scan images can aid radiologists by providing additional medically relevant information to consider in their assessment of medical images. Such algorithms, if robustly validated in clinical settings, carry the potential to improve the health of the general population. In this thesis, we first give an analysis of publicly available chest CT scan annotation data, in which we determine upper bounds on expected classification accuracy when certain radiological features are used as inputs to statistical learning algorithms for the purpose of inferring the likelihood that a lung nodule is malignant or benign. Second, a statistical extension of the level set method for image segmentation is introduced and applied to both synthetically generated and real three-dimensional image volumes of lung nodules in chest CT scans, obtaining results comparable to the current state of the art on the latter.
 Date Issued
 2018
 Identifier
 2018_Sp_Hancock_fsu_0071E_14427
 Format
 Thesis
 Title
 Analysis of cross-classified data using negative binomial models.
 Creator

Ramakrishnan, Viswanathan, Florida State University
 Abstract/Description

Several procedures are available for analyzing cross-classified data under the Poisson model. When data suggest the presence of "non-Poisson" variation, an alternative model is desirable. Often a negative binomial model is useful as an alternative. In this dissertation, methodology for analyzing data under a two-parameter negative binomial model is provided. A conditional likelihood approach is suggested to simplify estimation and inference procedures. Large-sample properties of the conditional likelihood approach are derived. Based on simulations, these properties are examined for small samples. The suggested methodology is applied to two sets of data from ecological research studies.
 Date Issued
 1989
 Identifier
 AAI9016503, 3161994, FSDT3161994, fsu:78193
 Format
 Document (PDF)
 Title
 An analysis of test reliability.
 Creator

Isaacson, Fenton R., Florida State University
 Abstract/Description

"The need for efficient means of testing has long been recognized. To obtain efficiency in testing requires the study of four attributes of the testing instrument, namely: reliability, validity, interpretability and administrability. It is the purpose of this paper to examine in some detail the first of these attributes, reliability. In particular, this is an attempt to analyse the reliability of Mathematics 101 Test D which was administered at Florida State University in the fall of 1948" (Introduction).
 Date Issued
 1949
 Identifier
 FSU_historic_AKP4870
 Format
 Thesis
 Title
 AP Student Visual Preferences for Problem Solving.
 Creator

Swoyer, Liesl, Department of Statistics
 Abstract/Description

The purpose of this study is to explore the mathematical preference of high school AP Calculus students by examining their tendencies for using differing methods of thought. A student's preferred mode of thinking was measured on a scale ranging from a preference for analytical thought to a preference for visual thought as they completed derivative and antiderivative tasks presented both algebraically and graphically. This relates to previous studies by continuing to analyze the factors that have been found to mediate the students' performance and preference with regard to a variety of calculus tasks. Data was collected by Dr. Erhan Haciomeroglu at the University of Central Florida. Students' preferences were not affected by gender. Students were found to approach graphical and algebraic tasks similarly, without any significant change with regard to the derivative or antiderivative nature of the tasks. Highly analytic and highly visual students revealed the same proportion of change in visuality as harmonic students when more difficult calculus tasks were encountered. Thus, a strong preference for visual thinking when completing algebraic tasks was not the determining factor of their preferred method of thinking when approaching graphical tasks.
 Date Issued
 2012
 Identifier
 FSU_migr_uhm0052
 Format
 Thesis
 Title
 A Bayesian Approach to Meta-Regression: The Relationship Between Body Mass Index and All-Cause Mortality.
 Creator

Marker, Mahtab, McGee, Dan, Hurt, Myra, Niu, Xiufeng, Huffer, Fred, Department of Statistics, Florida State University
 Abstract/Description

This thesis presents a Bayesian approach to Meta-Regression and Individual Patient Data (IPD) Meta-analysis. The focus of the research is on establishing the relationship between Body Mass Index (BMI) and all-cause mortality. This has been an area of continuing interest in the medical and public health communities, and no consensus has been reached on what the optimal weight for individuals is. Standards are usually specified in terms of body mass index (BMI = weight (kg) / height (m)^2), which is associated with body fat percentage. Many studies in the literature have modelled the relationship between BMI and mortality and reported a variety of relationships including U-shaped, J-shaped and linear curves. The aim of my research was to use statistical methods to determine whether we can combine these diverse results and obtain a single estimated relationship, using which one can find the point of minimum mortality and establish reasonable ranges for optimal BMI, or how we can best examine the reasons for the heterogeneity of results. Commonly used techniques of Meta-analysis and Meta-regression are explored and a problem with the estimation procedure in the multivariate setting is presented. A Bayesian approach using a Hierarchical Generalized Linear Mixed Model is suggested and implemented to overcome this drawback of standard estimation techniques. Another area which is explored briefly is that of Individual Patient Data meta-analysis. A Frailty model or Random Effects Proportional Hazards Survival model approach is proposed to carry out IPD meta-regression and come up with a single estimated relationship between BMI and mortality, adjusting for the variation between studies.
 Date Issued
 2007
 Identifier
 FSU_migr_etd2736
 Format
 Thesis
 Title
 Bayesian Dynamic Survival Models for Longitudinal Aging Data.
 Creator

He, Jianghua, McGee, Daniel L., Niu, Xufeng, Johnson, Suzanne B., Huffer, Fred W., Department of Statistics, Florida State University
 Abstract/Description

In this study, we will examine the Bayesian Dynamic Survival Models, time-varying coefficients models from a Bayesian perspective, and their applications in the aging setting. The specific questions we are interested in are: Does the relative importance of characteristics measured at a particular age, such as blood pressure, smoking, and body weight, with respect to heart disease or death change as people age? If it does, how can we model the change? And how does the change affect the analysis results if fixed-effect models are applied? In the epidemiological and statistical literature, the relationship between a risk factor and the risk of an event is often described in terms of the numerical contribution of the risk factor to the total risk within a follow-up period, using methods such as contingency tables and logistic regression models. With the development of survival analysis, another method, the Proportional Hazards Model, has become more popular. This model describes the relationship between a covariate and risk within a follow-up period as a process, under the assumption that the hazard ratio of the covariate is fixed during the follow-up period. Neither previous methods nor the Proportional Hazards Model allows the effect of a covariate to change flexibly with time. In this study, we intend to investigate some classic epidemiological relationships using appropriate methods that allow coefficients to change with time, and compare our results with those found in the literature. After describing what has been done in previous work based on multiple logistic regression or discriminant function analysis, we summarize different methods for estimating the time-varying coefficient survival models that are developed specifically for the situations under which the proportional hazards assumption is violated. We will focus on the Bayesian Dynamic Survival Model because its flexibility and Bayesian structure fit our study goals.
There are two estimation methods for the Bayesian Dynamic Survival Models, the Linear Bayesian Estimation (LBE) method and the Markov Chain Monte Carlo (MCMC) sampling method. The LBE method is simpler, faster, and more flexible to calculate, but it requires specifications of some parameters that usually are unknown. The MCMC method gets around the difficulty of specifying parameters, but is much more computationally intensive. We will use a simulation study to investigate the performances of these two methods, and provide suggestions on how to use them effectively in application. The Bayesian Dynamic Survival Model is applied to the Framingham Heart Study to investigate the time-varying effects of covariates such as gender, age, smoking, and SBP (Systolic Blood Pressure) with respect to death. We also examined the changing relationship between BMI (Body Mass Index) and all-cause mortality, and suggested that some of the heterogeneity observed in the results found in the literature is likely to be a consequence of using fixed-effect models to describe a time-varying relationship.
 Date Issued
 2007
 Identifier
 FSU_migr_etd4174
 Format
 Thesis
 Title
 Bayesian Generalized Polychotomous Response Models and Applications.
 Creator

Yang, Fang, Niu, XuFeng, Johnson, Suzanne B., McGee, Dan, Huffer, Fred, Department of Statistics, Florida State University
 Abstract/Description

Polychotomous quantal response models are widely used in medical and econometric studies to analyze categorical or ordinal data. In this study, we apply the Bayesian methodology through a mixed-effects polychotomous quantal response model. For the Bayesian polychotomous quantal response model, we assume uniform improper priors for the regression coefficients and explore the sufficient conditions for a proper joint posterior distribution of the parameters in the models. Simulation results from Gibbs sampling estimates will be compared to traditional maximum likelihood estimates to show the strength of using the uniform improper priors for the regression coefficients. Motivated by investigating the relationship between BMI categories and several risk factors, we carry out application studies to examine the impact of risk factors on BMI categories, especially for the categories of "Overweight" and "Obesities". By applying the mixed-effects Bayesian polychotomous response model with uniform improper priors, we obtain interpretations of the association between risk factors and BMI that are similar to findings in the literature.
 Date Issued
 2010
 Identifier
 FSU_migr_etd1092
 Format
 Thesis
 Title
 Bayesian Models for Capturing Heterogeneity in Discrete Data.
 Creator

Geng, Junxian, Slate, Elizabeth H., Pati, Debdeep, Schmertmann, Carl P., Zhang, Xin, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Population heterogeneity exists frequently in discrete data. Many Bayesian models perform reasonably well in capturing this subpopulation structure. Typically, the Dirichlet process mixture model (DPMM) and a variable dimensional alternative that we refer to as the mixture of finite mixtures (MFM) model are used, as they both have natural byproducts of clustering derived from Polya urn schemes. The first part of this dissertation focuses on a model for the association between a binary response and binary predictors. The model incorporates Boolean combinations of predictors, called logic trees, as parameters arising from a DPMM or MFM. Joint modeling is proposed to solve the identifiability issue that arises when using a mixture model for a binary response. Different MCMC algorithms are introduced and compared for fitting these models. The second part of this dissertation is the application of the mixture of finite mixtures model to community detection problems. Here, the communities are analogous to the clusters in the earlier work. A probabilistic framework that allows simultaneous estimation of the number of clusters and the cluster configuration is proposed. We prove clustering consistency in this setting. We also illustrate the performance of these methods with simulation studies and discuss applications.
 Date Issued
 2017
 Identifier
 FSU_2017SP_Geng_fsu_0071E_13791
 Format
 Thesis
 Title
 A Bayesian MRF Framework for Labeling Terrain Using Hyperspectral Imaging.
 Creator

Neher, Robert E., Srivastava, Anuj, Liu, Xiuwen, Huffer, Fred, Wegkamp, Marten, Department of Statistics, Florida State University
 Abstract/Description

We explore the non-Gaussianity of hyperspectral data and present probability models that capture variability of hyperspectral images. In particular, we present a nonparametric probability distribution that models the distribution of the hyperspectral data after reducing the dimension of the data via either principal components or Fisher's discriminant analysis. We also explore the directional differences in observed images and present two parametric distributions, the generalized Laplacian and the Bessel K form, that model the non-Gaussian behavior of the directional differences well. We then propose a model that labels each spatial site using Bayesian inference and Markov random fields. The model incorporates the information of the nonparametric distribution of the data and the parametric distributions of the directional differences, along with a prior distribution that favors smooth labeling. We then test our model on actual hyperspectral data and present the results of our model, using the Washington D.C. Mall and Indian Springs rural area data sets.
 Date Issued
 2004
 Identifier
 FSU_migr_etd2691
 Format
 Thesis
 Title
 Bayesian nonparametric estimation via Gibbs sampling for coherent systems with redundancy.
 Creator

Lawson, Kevin Lee, Florida State University
 Abstract/Description

We consider a coherent system S consisting of m independent components for which we do not know the distributions of the components' lifelengths. If we know the structure function of the system, then we can estimate the distribution of the system lifelength by estimating the distributions of the lifelengths of the individual components. Suppose that we can collect data under the 'autopsy model', wherein a system is run until a failure occurs and then the status (functioning or dead) of each component is obtained. This test is repeated n times. The autopsy statistics consist of the age of the system at the time of breakdown and the set of parts that are dead by the time of breakdown. Using the structure function and the recorded status of the components, we then classify the failure time of each component. We develop a nonparametric Bayesian estimate of the distributions of the component lifelengths and then use this to obtain an estimate of the distribution of the lifelength of the system. The procedure is applicable to machine-test settings wherein the machines have redundant designs. A parametric procedure is also given.
 Date Issued
 1994
 Identifier
 AAI9502812, 3088467, FSDT3088467, fsu:77272
 Format
 Document (PDF)
 Title
 Bayesian Portfolio Optimization with Time-Varying Factor Models.
 Creator

Zhao, Feng, Niu, Xufeng, Cheng, Yingmei, Huffer, Fred W., Zhang, Jinfeng, Department of Statistics, Florida State University
 Abstract/Description

We develop a modeling framework to simultaneously evaluate various types of predictability in stock returns, including stocks' sensitivity ("betas") to systematic risk factors, stocks' abnormal returns unexplained by risk factors ("alphas"), and returns of risk factors in excess of the risk-free rate ("risk premia"). Both firm-level characteristics and macroeconomic variables are used to predict stocks' time-varying alphas and betas, and macroeconomic variables are used to predict the risk premia. All of the models are specified in a Bayesian framework to account for estimation risk, and informative prior distributions on both stock returns and model parameters are adopted to reduce estimation error. To gauge the economic significance of the predictability, we apply the models to the U.S. stock market and construct optimal portfolios based on model predictions. Out-of-sample performance of the portfolios is evaluated to compare the models. The empirical results confirm predictability from all of the sources considered in our model: (1) The equity risk premium is time-varying and predictable using macroeconomic variables; (2) Stocks' alphas and betas differ cross-sectionally and are predictable using firm-level characteristics; and (3) Stocks' alphas and betas are also time-varying and predictable using macroeconomic variables. Comparison of different subperiods shows that the predictability of stocks' betas is persistent over time, but the predictability of stocks' alphas and the risk premium has diminished to some extent. The empirical results also suggest that Bayesian statistical techniques, especially the use of informative prior distributions, help reduce model estimation error and result in portfolios that outperform the passive indexing strategy. The findings are robust in the presence of transaction costs.
 Date Issued
 2011
 Identifier
 FSU_migr_etd0526
 Format
 Thesis
 Title
 A Bayesian Semiparametric Joint Model for Longitudinal and Survival Data.
 Creator

Wang, Pengpeng, Slate, Elizabeth H., Bradley, Jonathan R., Wetherby, Amy M., Lin, Lifeng, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Many biomedical studies monitor both a longitudinal marker and a survival time on each subject under study. Modeling these two endpoints as joint responses has the potential to improve the inference for both. We consider the approach of Brown and Ibrahim (2003) that proposes a Bayesian hierarchical semiparametric joint model. The model links the longitudinal and survival outcomes by incorporating the mean longitudinal trajectory as a predictor for the survival time. The usual parametric mixed effects model for the longitudinal trajectory is relaxed by using a Dirichlet process prior on the coefficients. A Cox proportional hazards model is then used for the survival time. The complicated joint likelihood increases the computational complexity. We develop a computationally efficient method by using a multivariate log-gamma distribution instead of a Gaussian distribution to model the data. We use Gibbs sampling combined with Neal's algorithm (2000) and the Metropolis-Hastings method for inference. Simulation studies illustrate the procedure and compare this log-gamma joint model with the Gaussian joint models. We apply this joint modeling method to a human immunodeficiency virus (HIV) data set and a prostate-specific antigen (PSA) data set.
 Date Issued
 2019
 Identifier
 2019_Spring_Wang_fsu_0071E_15120
 Format
 Thesis
 Title
 BAYESIAN SOLUTIONS TO SOME CLASSICAL PROBLEMS OF STATISTICS.
 Creator

PEREIRA, CARLOS ALBERTO DE BRAGANCA., Florida State University
 Abstract/Description

Three of the basic questions of Statistics may be stated as follows: (A) Which portion of the data X is actually informative about the parameter of interest (theta)? (B) How can all the relevant information about (theta) provided by the data X be extracted? (C) What kind of information about (theta) do the data X possess? The perspective of this dissertation is that of a Bayesian. Chapter I is essentially concerned with question A. The theory of conditional independence is explained and the relations between ancillarity, sufficiency, and statistical independence are discussed in depth. Some related concepts like specific sufficiency, bounded completeness, and splitting sets are also studied in some detail. The language of conditional independence is used in the remaining Chapters. Chapter II deals with question B for the particular problem of analysing categorical data with missing entries. It is demonstrated how a suitably chosen prior for the frequency parameters can streamline the analysis in the presence of missing entries due to nonresponse or other causes. The two cases where the data follow the Multinomial or the Multivariate Hypergeometric model are treated separately. In the first case it is adequate to restrict the prior (for the cell probabilities) to the class of Dirichlet distributions. In the Hypergeometric case it is convenient to select a prior (for the cell population frequencies) from the class of Dirichlet-Multinomial (DM) distributions. The DM distributions are studied in detail. Chapter III is directly related to question C. Conditions on the likelihood function and on the prior distribution are presented in order to assess the effect of the sample on the posterior distribution. More specifically, it is shown that under certain conditions, the larger the observations obtained, the larger (stochastically in terms of the posterior distribution) is the appropriate parameter. Finally, Chapter IV deals with the characterization of distributions in terms of Blackwell comparison of experiments. It is shown that a result (for the Hypergeometric model) obtained in Chapter II is actually a consequence of a property of complete families of distributions.
 Date Issued
 1980
 Identifier
 AAI8108380, 3084857, FSDT3084857, fsu:74358
 Format
 Document (PDF)
 Title
 Bayesian Tractography Using Geometric Shape Priors.
 Creator

Dong, Xiaoming, Srivastava, Anuj, Klassen, E. (Eric), Wu, Wei, Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Diffusion-weighted imaging (DWI) and tractography have been under development for decades and are key elements in recent large-scale efforts to map the human brain. Together, the two techniques provide a unique opportunity to access the macroscopic structure and connectivity of the human brain noninvasively and in vivo. The information obtained not only helps visualize brain connectivity and segment the brain into different functional areas but also provides tools for understanding some major cognitive diseases such as multiple sclerosis, schizophrenia, and epilepsy. Much effort has been put into this area. On the one hand, a vast spectrum of tractography algorithms has been developed in recent years, ranging from deterministic approaches through probabilistic methods to global tractography; on the other hand, various mathematical models, such as the diffusion tensor, the multi-tensor model, spherical deconvolution, and Q-ball modeling, have been developed to better exploit the acquisition-dependent signal of DWI. Despite considerable progress in this area, current methods still face many challenges, such as sensitivity to noise, many false positive and false negative fibers, an inability to handle complex fiber geometry, and high computational cost. More importantly, recent research has shown that, even with high-quality data, the results of current tractography methods may not improve, suggesting that it is unlikely that an anatomically accurate map of the human brain can be obtained solely from the diffusion profile. Motivated by these issues, this dissertation develops a global approach that incorporates anatomically validated geometric shape priors when reconstructing neuron fibers. The fiber tracts between regions of interest are initialized and updated via deformations based on gradients of the posterior energy defined in this work. This energy has contributions from diffusion data, shape prior information, and a roughness penalty. The dissertation first describes and demonstrates the proposed method on a 2D dataset and then extends it to 3D phantom data and real brain data. The results show that the proposed method is relatively immune to issues such as noise, complicated fiber structures like crossing and kissing fibers, and false positive fibers, and that it achieves more explainable tractography results.
 Date Issued
 2019
 Identifier
 2019_Spring_DONG_fsu_0071E_15144
 Format
 Thesis
 Title
 A Class of Mixed-Distribution Models with Applications in Financial Data Analysis.
 Creator

Tang, Anqi, Niu, Xufeng, Cheng, Yingmei, Wu, Wei, Huffer, Fred, Department of Statistics, Florida State University
 Abstract/Description

Statisticians often encounter data in the form of a combination of discrete and continuous outcomes. A special case is zero-inflated longitudinal data, where the response variable has a large portion of zeros. These data exhibit correlation because observations are obtained on the same subjects over time. In this dissertation, we propose a two-part mixed distribution model for zero-inflated longitudinal data. The first part of the model is a logistic regression model for the probability of a nonzero response; the other part is a linear model for the mean response given that the outcomes are not zero. Random effects with an AR(1) covariance structure are introduced into both parts of the model to allow for serial correlation and subject-specific effects. Estimating the two-part model is challenging because of the high-dimensional integration necessary to obtain the maximum likelihood estimates. We propose a Monte Carlo EM (MCEM) algorithm for computing the maximum likelihood estimates of the parameters. Through a simulation study, we demonstrate the good performance of the MCEM method in parameter and standard error estimation. To illustrate, we apply the two-part model with correlated random effects and the model with autoregressive random effects to executive compensation data to investigate potential determinants of CEO stock option grants.
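The two-part structure described in this abstract can be written out roughly as follows. This is only a sketch: the notation (subject $i$, occasion $t$, covariate vectors $\mathbf{x}_{it}$ and $\mathbf{z}_{it}$, random effects $a_{it}$ and $b_{it}$) and the exact form of the random effects are assumptions made for illustration, not the dissertation's own specification.

```latex
\begin{align*}
\operatorname{logit} P(Y_{it} \neq 0 \mid a_{it})
  &= \mathbf{x}_{it}^{\top}\boldsymbol{\alpha} + a_{it}
  && \text{(part 1: probability of a nonzero response)} \\
E(Y_{it} \mid Y_{it} \neq 0,\, b_{it})
  &= \mathbf{z}_{it}^{\top}\boldsymbol{\beta} + b_{it}
  && \text{(part 2: mean of the nonzero responses)} \\
\operatorname{Cov}(a_{is}, a_{it}) &= \sigma_a^{2}\,\rho_a^{|s-t|}, \qquad
\operatorname{Cov}(b_{is}, b_{it}) = \sigma_b^{2}\,\rho_b^{|s-t|}
  && \text{(AR(1) random effects, assumed form)}
\end{align*}
```

Under a formulation of this kind, the likelihood integrates over the random-effect vectors of each subject, which is the high-dimensional integration that motivates a Monte Carlo EM approach.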
 Date Issued
 2011
 Identifier
 FSU_migr_etd1710
 Format
 Thesis
 Title
 Comparative mRNA Expression Analysis Leveraging Known Biochemical Interactions.
 Creator

Steppi, Albert Joseph, Zhang, Jinfeng, Sang, Qing-Xiang, Wu, Wei, Niu, Xufeng, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

We present two studies incorporating existing biological knowledge into differential gene expression analysis that attempt to place the results within a broader biological context. The studies investigate breast cancer health disparity between differing ethnic groups by comparing gene expression levels in tumor samples from patients from different ethnic populations. We incorporate existing knowledge by making comparisons not just between individual genes, but between sets of related genes and networks of interacting genes. In the first study, a comparison is made between mRNA expression patterns in Asian and Caucasian American breast cancer samples in an attempt to better understand why there are significantly lower breast cancer incidence and mortality rates in Asian Americans compared to Caucasian Americans. In the second study, the expression levels of genes related to drug and xenobiotic metabolizing enzymes (DXME) are compared between African, Asian, and Caucasian American breast cancer patients. The expression of genes related to these enzymes has been found to significantly affect drug clearance and the onset of drug resistance. Both studies found differentially expressed genes and pathways that may be associated with health disparities between the three ethnic populations. A thorough investigation of the literature was made in order to understand the context in which these differences in gene expression could affect the development and progression of breast tumors, and to identify genes and pathways that may be differentially expressed between the ethnic groups in general but not associated with breast cancer. Many of the relevant differences in gene expression were found to be linked to factors such as diet and differences in body composition. The process of finding relevant pathways and sets of interacting genes to inform comparative mRNA expression analysis can be laborious and time-consuming. The literature is expanding at an exponential rate, and there is little hope that research groups can keep up with all of the latest research. It is becoming more common for journals to require authors to make their results available in public databases, but many results concerning biochemical interactions are only accessible in unstructured text. Extracting relationships and interactions from the biological literature using techniques from machine learning and natural language processing is an important and growing field of research. To gain a better understanding of this field, we participated in the BioCreative VI Track 4 challenge, which involved classifying PubMed abstracts that contain examples of protein-protein interactions that are affected by a mutation. We discuss the model we developed and the lessons learned while participating in the competition. The difficulty of acquiring sufficient quantities of quality labeled data is a great obstacle to improving performance. We present a web application we are developing to streamline the annotation of entity-entity interactions in text. It makes use of a database of known interactions to locate passages that are likely to be relevant and offers a simple and concise user interface to minimize the cognitive burden on the annotator.
 Date Issued
 2018
 Identifier
 2018_Sp_Steppi_fsu_0071E_14522
 Format
 Thesis
 Title
 A Comparison of Estimators in Hierarchical Linear Modeling: Restricted Maximum Likelihood versus Bootstrap via Minimum Norm Quadratic Unbiased Estimators.
 Creator

Delpish, Ayesha Nneka, Niu, Xu-Feng, Tate, Richard L., Huffer, Fred W., Zahn, Douglas, Department of Statistics, Florida State University
 Abstract/Description

The purpose of the study was to investigate the relative performance of two estimation procedures, restricted maximum likelihood (REML) and the bootstrap via minimum norm quadratic unbiased estimators (MINQUE), for a two-level hierarchical linear model under a variety of conditions. Specific focus lay on observing whether the bootstrap via MINQUE procedure offered improved accuracy in the estimation of the model parameters and their standard errors in situations where normality may not be guaranteed. Through Monte Carlo simulations, the importance of this assumption for the accuracy of multilevel parameter estimates and their standard errors was assessed using the accuracy index of relative bias and by observing the coverage percentages of 95% confidence intervals constructed for both estimation procedures. The study systematically varied the number of groups at level 2 (30 versus 100), the size of the intraclass correlation (0.01 versus 0.20), and the distribution of the observations (normal versus chi-squared with 1 degree of freedom). The number of groups and intraclass correlation factors produced effects consistent with those previously reported: as the number of groups increased, the bias in the parameter estimates decreased, with a more significant effect observed for those estimates obtained via REML. High levels of the intraclass correlation also led to a decrease in the efficiency of parameter estimation under both methods. Study results show that while both the REML and the bootstrap via MINQUE estimates of the fixed effects were accurate, the efficiency of the estimates was affected by the distribution of errors, with the bootstrap via MINQUE procedure outperforming REML. Both procedures produced less efficient estimators under the chi-squared distribution, particularly for the variance-covariance component estimates.
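The two accuracy measures named in this abstract, relative bias and confidence-interval coverage, can be illustrated with a short sketch. The definitions below are the conventional ones and are assumed here, not quoted from the study; the function names and all numbers are hypothetical.

```python
def relative_bias(estimates, true_value):
    """Relative bias: (mean of the estimates - true value) / true value."""
    mean_estimate = sum(estimates) / len(estimates)
    return (mean_estimate - true_value) / true_value


def coverage(intervals, true_value):
    """Fraction of (lower, upper) intervals that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return hits / len(intervals)


# Hypothetical results from four simulation replications (true value = 1.0).
estimates = [0.9, 1.1, 1.2, 1.0]
intervals = [(0.5, 1.5), (1.1, 1.9), (0.2, 0.9), (0.8, 1.2)]

rb = relative_bias(estimates, 1.0)   # mean estimate is 1.05
cov = coverage(intervals, 1.0)       # two of the four intervals cover 1.0
print(f"relative bias = {rb:.3f}, coverage = {cov:.2f}")
```

In a simulation study of this kind, these quantities would be computed per parameter and per design condition, with coverage compared against the nominal 95% level.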
 Date Issued
 2006
 Identifier
 FSU_migr_etd0771
 Format
 Thesis
 Title
 A comparison of robust and least squares regression models using actual and simulated data.
 Creator

Gilbert, Scott Alan., Florida State University
 Abstract/Description

The purpose of this study was to compare several robust regression techniques to ordinary least squares (OLS) regression when analyzing bivariate and multivariate data. The bivariate analysis compared the performance of alternative robust procedures with that of the standard OLS regression techniques in regard to the detection of outliers. The bivariate analysis demonstrated the weaknesses of OLS regression and the standard OLS outlier diagnostic techniques when multiple outliers are present. In addition, this research assessed the empirical performance of alpha and power under three nonnormal probability density functions using a Monte Carlo simulation. The first analysis focused on several bivariate data sets. Each data set was plotted and each of the regression models was used to analyze the data. The usual results (e.g., $R^2$, regression coefficients, standard errors, and regression diagnostics) were examined to give a visual as well as empirical analysis of the models' performance in the presence of multiple outliers. The second component of this study entailed a Monte Carlo simulation of five robust regression models and OLS regression under four probability density functions. The variables included in the study were placed in one $2^1 \times 3^2$ and two $3^2$ factorial designs repeated over four probability density functions, resulting in a total of 90 experimental runs of the Monte Carlo simulation. Random samples were generated and then transformed to fit desired distributional moment characteristics. The incremental null hypothesis was used as the basis to calculate empirical alpha and power values. The analysis demonstrated the inadequacies of the standard OLS-based outlier detection methods and explained how regression analysis could be improved if a robust regression method is used in parallel with OLS regression. The multivariate analysis demonstrated the robustness of the OLS regression model to three nonnormal populations. It further demonstrated a moderate inflation of alpha for the M-class of robust regression models and a lack of power stability with the rank transform regression method. Based on the results of this study, recommendations were made for using robust regression methods and suggestions for future research were offered.
 Date Issued
 1992
 Identifier
 AAI9222385, 3087822, FSDT3087822, fsu:76632
 Format
 Document (PDF)
 Title
 THE COMPARISON OF SENSITIVITIES OF EXPERIMENTS (MAXIMUM LIKELIHOOD, RANDOM, FIXED, ANALYSIS OF VARIANCE).
 Creator

YOUNG, BARBARA NELSON., Florida State University
 Abstract/Description

The sensitivity of a measurement technique is defined to be its ability to detect differences among the treatments in a fixed effects design, or the presence of a between-treatments component of variance in a random effects design. Consider an experiment, consisting of two identical subexperiments, designed specifically for the purpose of comparing two measurement techniques. It is assumed that the techniques of analysis of variance are applicable in analyzing the data obtained from the two measurement techniques. The subexperiments may have either fixed or random treatment effects in either one-way or general block designs. It is assumed that the experiment yields bivariate observations from the two measurement methods which may or may not be independent. Likelihood ratio tests are used in the various settings of this dissertation to both extend current techniques and provide alternative methods for comparing the sensitivities of experiments.
 Date Issued
 1985
 Identifier
 AAI8524629, 3086182, FSDT3086182, fsu:75665
 Format
 Document (PDF)
 Title
 A Comparison of Three Approaches to Confidence Interval Estimation for Coefficient Omega.
 Creator

Xu, Jie, Yang, Yanyun, Becker, Betsy Jane, Almond, Russell G., Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Coefficient Omega was introduced by McDonald (1978) as a reliability coefficient of composite scores for the congeneric model. Interval estimation (Neyman, 1937) on coefficient Omega provides a range of plausible values that is likely to capture the population reliability of composite scores. The Wald method, the likelihood method, and the bias-corrected and accelerated (BCa) bootstrap method are three methods for constructing confidence intervals for coefficient Omega (e.g., Cheung, 2009b; Kelley & Cheng, 2012; Raykov, 2002, 2004, 2009; Raykov & Marcoulides, 2004; Padilla & Divers, 2013). Only a very limited number of studies evaluating these three methods can be found in the literature (e.g., Cheung, 2007, 2009a, 2009b; Kelley & Cheng, 2012; Padilla & Divers, 2013). No simulation study has been conducted to evaluate the performance of these three methods for interval construction on coefficient Omega. In the current simulation study, I assessed these three methods by comparing their empirical performance on interval estimation for coefficient Omega. Four factors were included in the simulation design: sample size, number of items, factor loading, and degree of nonnormality. Two thousand datasets were generated in R 2.15.0 (R Core Team, 2012) for each condition. For each generated dataset, the three approaches (i.e., the Wald method, the likelihood method, and the BCa bootstrap method) were used to construct a 95% confidence interval for coefficient Omega in R 2.15.0. The results showed that when the data were multivariate normally distributed, the three methods performed equally well and coverage probabilities were very close to the prespecified .95 confidence level. When the data were multivariate nonnormally distributed, coverage probabilities decreased and intervals became wider for all three methods as the degree of nonnormality increased. In general, when the data departed from multivariate normality, the BCa bootstrap method performed better than the other two methods, with relatively higher coverage probabilities, while the Wald and likelihood methods were comparable and yielded narrower intervals than the BCa bootstrap method.
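For readers unfamiliar with the quantity being estimated, the point estimate of coefficient Omega can be sketched as below. The formula is the commonly cited one for a one-factor congeneric model, assuming a factor variance of 1 and uncorrelated errors; it is not taken from the thesis. The function name and the loading values are hypothetical.

```python
def coefficient_omega(loadings, unique_variances):
    """Coefficient omega for a one-factor congeneric model.

    Assumes the factor variance is fixed at 1 and the item errors are
    uncorrelated, so that
        omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of unique variances).
    """
    if len(loadings) != len(unique_variances):
        raise ValueError("need one unique variance per loading")
    common = sum(loadings) ** 2
    return common / (common + sum(unique_variances))


# Hypothetical standardized loadings for a four-item scale.
loadings = [0.7, 0.7, 0.7, 0.7]
unique_variances = [1.0 - lam ** 2 for lam in loadings]

omega = coefficient_omega(loadings, unique_variances)
print(f"omega = {omega:.3f}")
```

A confidence interval method such as those compared in the abstract would wrap a point estimate of this kind, for example by recomputing it on bootstrap resamples of the item data.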
 Date Issued
 2014
 Identifier
 FSU_migr_etd9269
 Format
 Thesis
 Title
 A comparison of two methods of bootstrapping in a reliability model.
 Creator

Chiang, Yuang-Chin., Florida State University
 Abstract/Description

We consider bootstrapping in the following reliability model, which was considered by Doss, Freitag, and Proschan (1987). Available for testing is a sample of iid systems, each having the same structure of $m$ independent components. Each system is continuously observed until it fails. For every component in each system, either a failure time or a censoring time is recorded. A failure time is recorded if the component fails before or at the time of system failure; otherwise a censoring time is recorded. To estimate the distributions of the component lifelengths $F_1, \ldots, F_m$, one can formally compute the Kaplan-Meier estimates $\hat{F}_1, \ldots, \hat{F}_m$. Various quantities of interest, such as the probability that a new system will survive time $t_0$, may then be estimated by combining $\hat{F}_1, \ldots, \hat{F}_m$ in a suitable way. In this model, bootstrapping can be carried out in two different ways. One can resample $n$ systems at random from the original $n$ systems. Alternatively, one can construct artificial systems by generating independent random lifelengths from the Kaplan-Meier estimates $\hat{F}_j$, and from those form artificial data. The two methods are distinct. We show that asymptotically, bootstrapping by either method yields correct answers. We also compare the two methods via simulation studies.
 Date Issued
 1988
 Identifier
 AAI8906216, 3161719, FSDT3161719, fsu:77918
 Format
 Document (PDF)
 Title
 The computation of probabilities which involve spacings, with applications to the scan statistic.
 Creator

Lin, Chien-Tai., Florida State University
 Abstract/Description

We develop a methodology for evaluating probabilities which involve linear combinations of spacings and then present some applications of this methodology. The basic idea underlying our method was given by Huffer (1988): a recursion is used to break up the joint distribution of several linear combinations of spacings into a sum of simpler components. The same recursion is then applied to each of these components, and so on. The process is continued until we obtain components which are simple and easily expressed in closed form. We describe algorithms and a computer program (written in C) which implement this approach. Our approach has two advantages. First, it is fairly general and can be used to solve a variety of problems involving linear combinations of spacings. Second, because the output of our procedure is a polynomial whose coefficients are computed exactly, we can supply numerical answers which are accurate to any required degree of precision. We apply our program to compute the distribution of the scan statistic for small sample sizes. We also use the recursion and computer program to calculate the lower order moments of the number of clumps in randomly distributed points. We can use these moments to obtain bounds and approximations for the distribution of the scan statistic. Our approximations are based on fitting a compound Poisson distribution to the moments of the number of clumps.
 Date Issued
 1993
 Identifier
 AAI9416150, 3088291, FSDT3088291, fsu:77095
 Format
 Document (PDF)
 Title
 Conditional bootstrap methods for censored data.
 Creator

Kim, Ji-Hyun., Florida State University
 Abstract/Description

We first consider the random censorship model of survival analysis. The pairs of positive random variables $(X_i, Y_i)$, $i = 1, \ldots, n$, are independent and identically distributed, with distribution functions $F(t) = P(X_i \leq t)$ and $G(t) = P(Y_i \leq t)$, and the $Y$'s are independent of the $X$'s. We observe only $(T_i, \delta_i)$, $i = 1, \ldots, n$, where $T_i = \min(X_i, Y_i)$ and $\delta_i = I(X_i \leq Y_i)$. The $X$'s represent survival times, and the $Y$'s represent censoring times. Efron (1981) proposed two bootstrap methods for the random censorship model and showed that they are distributionally the same. Akritas (1986) established the weak convergence of the bootstrapped Kaplan-Meier estimator of $F$ when bootstrapping is done by this method. Let us now consider bootstrapping more closely. Suppose that we wish to estimate the variance of the Kaplan-Meier estimator $\hat{F}(t)$. If we knew the $Y$'s, then we would condition on them by the ancillarity principle, since the distribution of the $Y$'s does not depend on $F$. That is, we would want to estimate $\mathrm{Var}\{\hat{F}(t) \mid Y_1, \ldots, Y_n\}$. Unfortunately, in the random censorship model we do not see all the $Y$'s. If $\delta_i = 0$ we see the exact value of $Y_i$, but if $\delta_i = 1$ we know only that $Y_i > T_i$. Let us denote this information on the $Y$'s by $\mathcal{C}$. Thus, what we want to estimate is $\mathrm{Var}\{\hat{F}(t) \mid \mathcal{C}\}$. Efron's scheme is appropriate for estimating the unconditional variance. We propose a new bootstrap method which provides an estimate of $\mathrm{Var}\{\hat{F}(t) \mid \mathcal{C}\}$. In this research we show that the Kaplan-Meier estimator of $F$ formed by the new bootstrap method has the same limiting distribution as the one formed by Efron's approach. The results of simulation studies assessing the small sample performance of the two bootstrap methods are reported. We also consider the model in which the $X_i$'s are censored by the $Y_i$'s and also by known fixed constants, and propose an appropriate bootstrap method for that model. This bootstrap method is a readily modified version of the new bootstrap method above.
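The random censorship setup and the unconditional bootstrap mentioned in this abstract can be illustrated with a minimal sketch. It resamples the observed pairs with replacement, which is the pair-resampling form of Efron's scheme, and recomputes the Kaplan-Meier estimate on each resample. It does not implement the conditional bootstrap proposed in the dissertation, whose details the abstract does not give. The exponential distributions, sample sizes, and function names are assumptions chosen for illustration.

```python
import random


def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of F(t) = P(X <= t) from pairs (T_i, delta_i)."""
    survival = 1.0
    # Distinct uncensored observation times up to t, in increasing order.
    event_times = sorted({ti for ti, d in zip(times, events) if d == 1 and ti <= t})
    for u in event_times:
        at_risk = sum(1 for ti in times if ti >= u)
        failures = sum(1 for ti, d in zip(times, events) if ti == u and d == 1)
        survival *= 1.0 - failures / at_risk
    return 1.0 - survival


def simulate_censored_sample(n, rng):
    """Draw (T_i, delta_i) with T_i = min(X_i, Y_i) and delta_i = I(X_i <= Y_i)."""
    times, events = [], []
    for _ in range(n):
        x = rng.expovariate(1.0)  # survival time X_i (assumed exponential)
        y = rng.expovariate(0.5)  # censoring time Y_i (assumed exponential)
        times.append(min(x, y))
        events.append(1 if x <= y else 0)
    return times, events


def pair_bootstrap_variance(times, events, t, n_boot, rng):
    """Unconditional bootstrap variance of the Kaplan-Meier estimate at t."""
    n = len(times)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample pairs with replacement
        estimates.append(kaplan_meier([times[i] for i in idx],
                                      [events[i] for i in idx], t))
    mean = sum(estimates) / n_boot
    return sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)


rng = random.Random(0)
times, events = simulate_censored_sample(50, rng)
f_hat = kaplan_meier(times, events, 1.0)
var_hat = pair_bootstrap_variance(times, events, 1.0, 200, rng)
print(f"KM estimate of F(1.0): {f_hat:.3f}, bootstrap variance: {var_hat:.5f}")
```

Because whole pairs are resampled, the censoring pattern changes from one resample to the next, so the resulting variance is unconditional rather than conditional on the information $\mathcal{C}$.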
 Date Issued
 1990
 Identifier
 AAI9113938, 3162201, FSDT3162201, fsu:78399
 Format
 Document (PDF)
 Title
 Contributions to the theory of arrangement increasing functions.
 Creator

Proschan, Michael Arthur., Florida State University
 Abstract/Description

A function $f(\underline{x})$ which increases each time we transpose an out of order pair of coordinates, $x_j > x_k$ for some $j < k$ [...] $x_j > x_k$ by transposing the two $x$ coordinates. The theory of arrangement increasing (AI) functions is tailor-made for ranking and selection problems, in which case we assume that the density $f(\underline{\theta}, \underline{x})$ of observations with respective parameters $\theta_1, \ldots, \theta_n$ is AI, and the goal is to determine the largest or smallest parameters. In this dissertation we present new applications of AI functions in such areas as biology and reliability, and we generalize the notion of AI functions. We consider multivector extensions, some with and one without respect to parameter vectors, and we connect these. Another generalization (TEGO) is motivated by the connection between total positivity (TP) and AI. TEGO results are shown to imply AI and TP results. We also define and develop a partial ordering on densities of rank vectors. The theory, which involves finding the extreme points of the convex set of AI rank densities, is then used to establish some power results of rank tests.
 Date Issued
 1989
 Identifier
 AAI9002934, 3161869, FSDT3161869, fsu:78068
 Format
 Document (PDF)
 Title
 Covariance on Manifolds.
 Creator

Balov, Nikolay H. (Nikolay Hristov), Srivastava, Anuj, Klassen, Eric, Patrangenaru, Victor, McGee, Daniel, Department of Statistics, Florida State University
 Abstract/Description

With the ever-increasing complexity of observational and theoretical data models, the sufficiency of the classical statistical techniques, designed to be applied only to vector quantities, is being challenged. Nonlinear statistical analysis has become an area of intensive research in recent years. Despite the impressive progress in this direction, a unified and consistent framework has not been reached. In this regard, the following work is an attempt to improve our understanding of random phenomena on non-Euclidean spaces. More specifically, the motivating goal of the present dissertation is to generalize the notion of distribution covariance, which in standard settings is defined only in Euclidean spaces, to arbitrary manifolds with a metric. We introduce a tensor field structure, named the covariance field, that is consistent with the heterogeneous nature of manifolds. It not only describes the variability imposed by a probability distribution but also provides alternative distribution representations. The covariance field combines the distribution density with geometric characteristics of its domain and thus fills the gap between these two. We present some of the properties of the covariance fields and argue that they can be successfully applied to various statistical problems. In particular, we provide a systematic approach for defining parametric families of probability distributions on manifolds, parameter estimation for regression analysis, nonparametric statistical tests for comparing probability distributions, and interpolation between such distributions. We then present several application areas where this new theory may have potential impact. One of them is the branch of directional statistics, with a domain of influence ranging from geosciences to medical image analysis. The fundamental level at which the covariance-based structures are introduced also opens a new area for future research.
 Date Issued
 2009
 Identifier
 FSU_migr_etd1045
 Format
 Thesis
 Title
 Critical Issues in Survey Meta-Analysis.
 Creator

Gozutok, Ahmet Serhat, Becker, Betsy Jane, Huffer, Fred W., Yang, Yanyun, Paek, Insu, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

In research synthesis, researchers may aim at summarizing people's attitudes and perceptions of phenomena that have been assessed using different measures. Self-report rating scales are among the most commonly used measurement tools to quantify such latent constructs in education and psychology. However, self-report rating-scale questions measuring the same construct may differ from each other in many ways. Scale format, number of response options, wording of questions, and labeling of response option categories may vary across questions. Consequently, variations across the measures of the same construct bring about the issue of comparability of the results across the studies in meta-analytic investigations. In this study, I examine the complexities of summarizing the results of different survey questions about the same construct in a meta-analytic fashion. More specifically, this study focuses on the practical problems that arise when combining survey items that differ from one another in the wording of question stems, numbers of response option categories, scale direction (i.e., unipolar and bipolar scales), response scale labeling (i.e., fully-labeled scales and endpoints-labeled scales), and response-option labeling (e.g., "extremely happy" / "completely happy" / "most happy", "pretty happy", "quite happy" / "moderately happy", and "not at all happy" / "least happy" / "most unhappy"). In addition, I propose practical solutions to handle the issues that arise due to such variations when conducting a meta-analysis. I discuss the implications of the proposed solutions from the perspective of meta-analysis. Examples are obtained from the collection of studies in the World Happiness Database (Veenhoven, 2006), which includes various single-item happiness measures.
 Date Issued
 2018
 Identifier
 2018_Fall_Gozutok_fsu_0071E_14866
 Format
 Thesis
 Title
 Cumulative regression function methods in survival analysis and time series.
 Creator

Zhang, MeiJie., Florida State University
 Abstract/Description

One may estimate a conditional hazard function from grouped (and possibly censored) survival data by the time- and covariate-specific occurrence/exposure rate. Asymptotic results for cumulative versions of this estimator are developed, utilizing the general framework of counting processes. In particular, a grouped-data-based goodness-of-fit test for Cox's proportional hazard model is given. Various constraints on the asymptotic behavior of the widths of the calendar periods and covariate strata employed in grouping the data are needed to prove the results. Actual performance of the estimators and test statistics is evaluated by Monte Carlo methods. We also consider the problem of identifying the class of time series model to which a series belongs based on observation of part of the series. Techniques of nonparametric estimation have been applied to this problem by Auestad and Tjostheim (Biometrika 77 (1990): 669-687), who used kernel estimates of the one-step lagged conditional mean and variance functions. We study cumulative versions of such estimates. These are more stable than the kernel estimates and can be used to construct confidence bands for the underlying cumulative mean and variance functions. Goodness-of-fit tests for specific parametric models are also developed.
 Date Issued
 1991
 Identifier
 AAI9202323, 3087663, FSDT3087663, fsu:76478
 Format
 Document (PDF)