 Title
 Inference for Semiparametric Time-Varying Covariate Effect Relative Risk Regression Models.
 Creator

Ye, Gang, McKeague, Ian W., Wang, Xiaoming, Huffer, Fred W., Song, KaiSheng, Department of Statistics, Florida State University
 Abstract/Description

A major interest of survival analysis is to assess covariate effects on survival via appropriate conditional hazard function regression models. The Cox proportional hazards model, which assumes an exponential form for the relative risk, has been a popular choice. However, other regression forms such as Aalen's additive risk model may be more appropriate in some applications. In addition, covariate effects may depend on time, which cannot be reflected by a Cox proportional hazards model. In this dissertation, we study a class of time-varying covariate effect regression models in which the link function (relative risk function) is twice continuously differentiable and prespecified, but otherwise general. This is a natural extension of the Prentice-Self model, in which the link function is general but covariate effects are modelled to be time invariant. In the first part of the dissertation, we focus on estimating the cumulative or integrated covariate effects. The standard martingale approach based on counting processes is utilized to derive a likelihood-based iterating equation. An estimator for the cumulative covariate effect that is generated from the iterating equation is shown to be √n-consistent. Asymptotic normality of the estimator is also demonstrated. Another aspect of the dissertation is to investigate a new test for the above time-varying covariate effect regression model and study consistency of the test based on martingale residuals. For Aalen's additive risk model, we introduce a test statistic based on the Huffer-McKeague weighted-least-squares estimator and show its consistency against some alternatives. An alternative way to construct a test statistic based on Bayesian Bootstrap simulation is introduced. An application to real lifetime data will be presented.
Date Issued
 2005
 Identifier
 FSU_migr_etd0949
 Format
 Thesis
 Title
 Predictive Accuracy Measures for Binary Outcomes: Impact of Incidence Rate and Optimization Techniques.
 Creator

Scolnik, Ryan, McGee, Daniel, Slate, Elizabeth H., Eberstein, Isaac W., Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Evaluating the performance of models predicting a binary outcome can be done using a variety of measures. While some measures intend to describe the model's overall fit, others more accurately describe the model's ability to discriminate between the two outcomes. If a model fits well but doesn't discriminate well, what does that tell us? Given two models, if one discriminates well but has poor fit while the other fits well but discriminates poorly, which of the two should we choose? The measures of interest for our research include the area under the ROC curve, Brier Score, discrimination slope, LogLoss, R-squared and F-score. To examine the underlying relationships among all of the measures, real data and simulation studies are used. The real data come from multiple cardiovascular research studies, and the simulation studies are run under general conditions and also for incidence rates ranging from 2% to 50%. The results of these analyses provide insight into the relationships among the measures and raise concern for scenarios when the measures may yield different conclusions. The impact of incidence rate on the relationships provides a basis for exploring alternative maximization routines to logistic regression. While most of the measures are easily optimized using the Newton-Raphson algorithm, the maximization of the area under the ROC curve requires optimization of a nonlinear, nondifferentiable function. Usage of the Nelder-Mead simplex algorithm and close connections to economics research yield unique parameter estimates and general asymptotic conditions. Using real and simulated data to compare optimizing the area under the ROC curve to logistic regression further reveals the impact of incidence rate on the relationships, significant increases in achievable areas under the ROC curve, and differences in conclusions about including a variable in a model.
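As an illustration of how four of the measures named in this abstract are commonly defined, the following is a minimal sketch and not the thesis's own code. The function names, the toy outcomes `y`, and the toy predicted probabilities `p` are all hypothetical; the definitions used are the standard textbook ones (area under the ROC curve as the fraction of correctly ordered event/non-event pairs, Brier score as mean squared error of the probabilities, discrimination slope as the difference in mean predicted probability between events and non-events).

```python
import math


def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def log_loss(y, p, eps=1e-15):
    """Average negative Bernoulli log-likelihood; probabilities are clipped."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1 - eps)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return total / len(y)


def auc(y, p):
    """Fraction of (event, non-event) pairs ranked correctly; ties count 1/2."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def discrimination_slope(y, p):
    """Mean predicted probability among events minus that among non-events."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


# Hypothetical toy data: two non-events followed by two events.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
for name, fn in [("AUC", auc), ("Brier", brier_score),
                 ("LogLoss", log_loss), ("Slope", discrimination_slope)]:
    print(name, round(fn(y, p), 4))
```

The pairwise form of the area under the ROC curve makes visible why it is a step function of the model parameters, which is consistent with the abstract's remark that maximizing it requires a derivative-free method.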
Date Issued
 2016
 Identifier
 FSU_2016SP_Scolnik_fsu_0071E_13146
 Format
 Thesis
 Title
 A Comparison of Estimators in Hierarchical Linear Modeling: Restricted Maximum Likelihood versus Bootstrap via Minimum Norm Quadratic Unbiased Estimators.
 Creator

Delpish, Ayesha Nneka, Niu, XuFeng, Tate, Richard L., Huffer, Fred W., Zahn, Douglas, Department of Statistics, Florida State University
 Abstract/Description

The purpose of the study was to investigate the relative performance of two estimation procedures, the restricted maximum likelihood (REML) and the bootstrap via MINQUE, for a two-level hierarchical linear model under a variety of conditions. Specific focus lay on observing whether the bootstrap via MINQUE procedure offered improved accuracy in the estimation of the model parameters and their standard errors in situations where normality may not be guaranteed. Through Monte Carlo simulations, the importance of this assumption for the accuracy of multilevel parameter estimates and their standard errors was assessed using the accuracy index of relative bias and by observing the coverage percentages of 95% confidence intervals constructed for both estimation procedures. The study systematically varied the number of groups at level-2 (30 versus 100), the size of the intraclass correlation (0.01 versus 0.20) and the distribution of the observations (normal versus chi-squared with 1 degree of freedom). The number of groups and intraclass correlation factors produced effects consistent with those previously reported: as the number of groups increased, the bias in the parameter estimates decreased, with a more significant effect observed for those estimates obtained via REML. High levels of the intraclass correlation also led to a decrease in the efficiency of parameter estimation under both methods. Study results show that while both the restricted maximum likelihood and the bootstrap via MINQUE estimates of the fixed effects were accurate, the efficiency of the estimates was affected by the distribution of errors, with the bootstrap via MINQUE procedure outperforming the REML. Both procedures produced less efficient estimators under the chi-squared distribution, particularly for the variance-covariance component estimates.
Date Issued
 2006
 Identifier
 FSU_migr_etd0771
 Format
 Thesis
 Title
 Assessing the Shelf Life of Retail Shrimp Using Real-Time Microrespirometer.
 Creator

Alderees, Fahad, Hsieh, YunHwa Peggy, Arjmandi, Bahram, Huffer, Fred W., Department of Nutrition, Food, and Exercise Science, Florida State University
 Abstract/Description

Shrimp is the most consumed seafood item in the United States (U.S.). Currently 90% of the shrimp consumed in the U.S. is imported from a few Asian countries. When imported shrimp arrives at its destination, it probably contains a load of microbial contamination due to post-harvest processing steps such as transportation, handling, preparation, beheading, peeling, deveining, packaging and storage that could add further bacterial contamination. Most of the U.S. import refusals belong to seafood shipments due to the detection of bacterial contamination and filthy appearance. Upon shipment arrival, testing for microbial activities of seafood requires a two-day incubation period when using the traditional Aerobic Plate Count (APC) method; however, a novel noninstrumental microrespirometer, which was developed by Hsieh and Hsieh (2000), can determine the microbial activity of the sample in real time by measuring the CO2 evolution rate (CER). CO2 is a byproduct of microbial respiration which can be used as a direct indicator of biological activity. The unique characteristic of this method is that it is a simple device that can determine the microbial activity in food in less than one hour, is highly sensitive in determining the CER, and is simple to operate. The use of the microrespirometer instead of the APC in testing imported seafood shipments will save a great deal of time and lower the cost for both importers and exporters by lowering the testing cost and reducing the costly waiting time at the ports. 
The specific objectives of this study are: 1) to validate the real-time microrespirometer method by correlating the rapid CER results with the traditional cultural APC method, 2) to establish a shrimp spoilage cutoff value of CER using the microrespirometer method by comparing the results with sensory analysis, 3) to examine the effect of chloramphenicol on shrimp shelf life using the noninstrumental microrespirometer, the APC method and sensory analysis and 4) to compare the shelf life of farm-raised imported shrimp with domestic wild-caught shrimp using the noninstrumental microrespirometer, APC, pH and sensory analysis. Frozen domestic wild-caught shrimp (Penaeus duorarum) and imported farm-raised shrimp (Penaeus vannamei) were purchased locally. Domestic shrimp were treated with chloramphenicol at 10 and 30 ppm and stored at 4°C along with the untreated domestic and imported shrimp. Samples were tested daily using the microrespirometer, APC, pH and olfactory sensory analysis. The p values and correlations between CER, APC and sensory analysis were determined using SPSS Statistics software and Microsoft Excel 2007. The microrespirometer and pH determinations were done in triplicate; the APC was performed in duplicate and the experiments were repeated twice. The CER method was found to be highly correlated with the APC (R²=0.812 to 0.929) for all samples stored at 4°C. When the samples' spoilage odor became noticeable, the average CER value of all samples was 27.23 µl/h/g. In order to allow for a small safety margin, a CER value of 25 µl/h/g was identified as a safe cutoff value for raw shrimp stored at 4°C. Samples treated with chloramphenicol had significant (P ... The differences in microbial quality and shelf life of the various shrimp samples (by source of origin and drug treatment) could be determined rapidly and accurately using the real-time CER method.
Date Issued
 2010
 Identifier
 FSU_migr_etd0160
 Format
 Thesis
 Title
 Essays on the Role of Trade Frictions in International Economics.
 Creator

Yoshimine, Koichi, Norrbin, Stefan C., Huffer, Fred W., Beaumont, Paul M., Garriga, Carlos, Department of Economics, Florida State University
 Abstract/Description

This dissertation consists of three essays. The first essay examines the effects of tax differentials on the trade balance across countries. Given that intrafirm trade accounts for a sizable share of the world's international trade, it is expected that income-shifting activities of multinational firms can bias the trade balance in many countries. Specifically, an increase in the relative tax liability in one country is expected to decrease the trade balance of that country. Using proxies for the effective tax liability of 19 OECD countries, the cointegrating regressions show significantly negative relationships between tax differentials and the trade balance among relatively small industrial countries. The second essay asks whether the empirically observed home biases in international trade are accounted for by a theoretical model. It has been pointed out that trade among individual Canadian provinces is much larger than the trade between individual Canadian provinces and individual U.S. states. There is a similar tendency in the trade among the OECD member countries. Obstfeld and Rogoff (2000) claim that such a bias can be explained if one takes into account the interaction between transaction costs and the elasticity of substitution. This study tests their claim using a dynamic general equilibrium model where agents pay proportional transaction costs. The simulation results show that the bias levels generated by plausible values for the transaction cost and elasticity are not particularly inconsistent with the observed levels in the U.S.-Canada relationship. The third essay tests a version of the international real business cycle model aimed at examining the effect on exchange-rate volatility of market segmentation generated by a trade friction across countries. Obstfeld and Rogoff (2000) argue that segmentation in international goods markets can explain the empirically observed real exchange-rate volatility. 
In this study, a trade cost in the goods market combined with income heterogeneity of consumers endogenously generates market segmentation by preventing a fraction of consumers from participating in international trade. Under such circumstances, the volatility of the exchange rate actually rises, but the volatility is still below the observed level, suggesting that trade cost alone cannot explain the anomalous exchange-rate behaviors.
Date Issued
 2004
 Identifier
 FSU_migr_etd0867
 Format
 Thesis
 Title
 Flexible Additive Risk Models Using Piecewise Constant Hazard Functions.
 Creator

Uhm, Daiho, Huffer, Fred W., Kercheval, Alec, McGee, Dan, Niu, Xufeng, Department of Statistics, Florida State University
 Abstract/Description

We study a weighted least squares (WLS) estimator for Aalen's additive risk model which allows for a very flexible handling of covariates. We divide the follow-up period into intervals and assume a constant hazard rate in each interval. The model is motivated as a piecewise approximation of a hazard function composed of three parts: arbitrary nonparametric functions for some covariate effects, smoothly varying functions for others, and known (or constant) functions for yet others. The proposed estimator is an extension of the grouped data version of the Huffer-McKeague estimator (1991). Our estimator may also be regarded as a piecewise constant analog of the semiparametric estimates of McKeague & Sasieni (1994), and Lin & Ying (1994). By using a fairly large number of intervals, we should get an essentially semiparametric model similar to the McKeague-Sasieni and Lin-Ying approaches. For our model, since the number of parameters is finite (although large), conventional approaches (such as maximum likelihood) are easy to formulate and implement. The approach is illustrated by simulations, and is applied to data from the Framingham heart study.
Date Issued
 2007
 Identifier
 FSU_migr_etd1464
 Format
 Thesis
 Title
 Monoclonal Antibody-Based Sandwich Enzyme-Linked Immunosorbent Assay for the Detection of Mammalian Meat in Meat and Feed Products.
 Creator

Rao, Qinchun, Hsieh, YunHwa Peggy, Huffer, Fred W., Sathe, Shridhar K., Department of Nutrition, Food, and Exercise Science, Florida State University
 Abstract/Description

Detection of mammalian tissue in nonmammalian meat or feed products is important for enforcement of food-labeling laws and prevention of the spread of transmissible spongiform encephalopathies (TSEs). This study was conducted to develop a monoclonal antibody-based sandwich enzyme-linked immunosorbent assay (ELISA) for rapid detection of raw, cooked (100°C, 30 min) and autoclaved (121°C/1.2 bar, 30 min) mammalian meats (beef, deer, elk, horse, lamb and pork) adulterated in nonmammalian meat (chicken, duck and turkey) and soy-based feed products, and to assess the performance of the assay. This assay utilized a pair of MAbs against the thermal-stable skeletal muscle protein troponin I (sTnI). MAb 6G1, specific to mammalian and poultry sTnIs, was used as the capture antibody and horseradish peroxidase (HRP) conjugated MAb 8F10, specific to mammalian sTnI, was used as the detection antibody. The assay conditions that were optimized include: the dilutions of the capture antibody and the detection antibody, the selection of the antibody buffer, the incubation time for antigen-antibody binding, and the dilutions of the adulterated meat and feed samples. The optimized assay achieved a detection limit of 0.05% (w/w) for raw, 0.50% (w/w) for cooked and 1.00% (w/w) for autoclaved beef in turkey (P ≤ 0.05); 0.50% (w/w) for pork in chicken mixtures (raw, cooked and autoclaved) (P ≤ 0.05); and 0.50% (w/w) for bovine meat meal in soy-based feed mixtures (P ≤ 0.05). The fat content (0-30%, w/w) of the meat samples did not significantly affect the assay signals (P ≥ 0.05). As the temperature and time of the heat treatment of the meat samples increased, the reactivity of this assay decreased slightly. However, the assay was still adequate to analyze samples subjected to the most severe heat treatment (132°C/2.0 bar, 120 min). 
This MAb-based sandwich ELISA is the first assay suitable for rapid, sensitive and reliable detection of undeclared mammalian proteins in meat and feed products, regardless of the extent of heat processing.
Date Issued
 2004
 Identifier
 FSU_migr_etd2122
 Format
 Thesis
 Title
 Discrete Frenet Frame with Application to Structural Biology and Kinematics.
 Creator

Lu, Yuanting, Quine, John R., Huffer, Fred W., Bertram, Richard, Cross, Timothy A., Cogan, Nick, Department of Mathematics, Florida State University
 Abstract/Description

The classical Frenet frame is a moving frame on a smooth curve. Connecting a sequence of points in space by line segments makes a discrete curve. The reference frame consisting of tangent, normal and binormal vectors at each point is defined as the discrete Frenet frame (DFF). The DFF is useful in studying shapes of long molecules such as proteins. In this dissertation, we provide a solid mathematical foundation for the DFF by showing that the limit of the Frenet formula for the DFF is the classical Frenet formula. As part of a survey of various ways to compute rigid body motion, we show that the Denavit-Hartenberg (DH) conventions in robotics are a special case of the DFF. Finally, we apply the DFF to solve the kink angle problem in protein alpha helical structure using data from NMR experiments.
Date Issued
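The construction of a frame on a discrete curve can be sketched as follows. This is an illustrative sketch under one common convention, which is an assumption here and may differ from the definition used in the dissertation: the tangent at a point is the unit vector along the outgoing segment, the binormal is the unit cross product of the incoming and outgoing tangents, and the normal is the cross product of binormal and tangent. The function names and sample points are hypothetical, and consecutive segments are assumed not to be collinear.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v):
    """Scale v to length one; fails on a zero vector."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("zero vector: repeated point or collinear segments")
    return tuple(x / norm for x in v)


def discrete_frenet_frames(points):
    """Return (tangent, normal, binormal) at each interior point.

    Assumed convention: t_k points along the segment leaving point k,
    b_k = unit(t_{k-1} x t_k), and n_k = b_k x t_k.
    """
    tangents = [unit(sub(q, p)) for p, q in zip(points, points[1:])]
    frames = []
    for k in range(1, len(tangents)):
        t = tangents[k]
        b = unit(cross(tangents[k - 1], t))
        n = cross(b, t)  # already unit length since b and t are orthonormal
        frames.append((t, n, b))
    return frames


# Hypothetical discrete curve with right-angle turns.
curve = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
for t, n, b in discrete_frenet_frames(curve):
    print("t =", t, "n =", n, "b =", b)
```

Each returned triple is orthonormal by construction, so the change from one frame to the next is a rotation, which is what makes such frames convenient for describing chains of rigid links.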
 2013
 Identifier
 FSU_migr_etd7477
 Format
 Thesis
 Title
 2D Affine and Projective Shape Analysis, and Bayesian Elastic Active Contours.
 Creator

Bryner, Darshan W., Srivastava, Anuj, Klassen, Eric, Gallivan, Kyle, Huffer, Fred, Wu, Wei, Zhang, Jinfeng, Department of Statistics, Florida State University
 Abstract/Description

An object of interest in an image can be characterized to some extent by the shape of its external boundary. Current techniques for shape analysis consider the notion of shape to be invariant to the similarity transformations (rotation, translation and scale), but oftentimes in 2D images of 3D scenes, perspective effects can transform shapes of objects in a more complicated manner than what can be modeled by the similarity transformations alone. Therefore, we develop a general Riemannian framework for shape analysis where metrics and related quantities are invariant to larger groups, the affine and projective groups, that approximate the transformations that arise from perspective skews. Highlighting two possibilities for representing object boundaries, ordered points (or landmarks) and parametrized curves, we study different combinations of these representations (points and curves) and transformations (affine and projective). Specifically, we provide solutions to three out of four situations and develop algorithms for computing geodesics and intrinsic sample statistics, leading up to Gaussian-type statistical models, and classifying test shapes using such models learned from training data. In the case of parametrized curves, an added issue is to obtain invariance to the reparameterization group. The geodesics are constructed by particularizing the path-straightening algorithm to geometries of current manifolds and are used, in turn, to compute shape statistics and Gaussian-type shape models. We demonstrate these ideas using a number of examples from shape and activity recognition. After developing such Gaussian-type shape models, we present a variational framework for naturally incorporating these shape models as prior knowledge in guidance of active contours for boundary extraction in images. 
This so-called Bayesian active contour framework is especially suitable for images where boundary estimation is difficult due to low contrast, low resolution, and presence of noise and clutter. In traditional active contour models, curves are driven towards the minimum of an energy composed of image and smoothing terms. We introduce an additional shape term based on shape models of prior known relevant shape classes. The minimization of this total energy, using iterated gradient-based updates of curves, leads to an improved segmentation of object boundaries. We demonstrate this Bayesian approach to segmentation using a number of shape classes in many imaging scenarios including the synthetic imaging modalities of SAS (synthetic aperture sonar) and SAR (synthetic aperture radar), for which accurate boundary extraction is notoriously difficult. In practice, the training shapes used for prior-shape models may be collected from viewing angles different from those for the test images and thus may exhibit a shape variability brought about by perspective effects. Therefore, by allowing for a prior shape model to be invariant to, say, affine transformations of curves, we propose an active contour algorithm where the resulting segmentation is robust to perspective skews.
Date Issued
 2013
 Identifier
 FSU_migr_etd8534
 Format
 Thesis
 Title
 Nonparametric Estimation of Three Dimensional Projective Shapes with Applications in Medical Imaging and in Pattern Recognition.
 Creator

Crane, Michael, Patrangenaru, Victor, Liu, Xiuwen, Huffer, Fred W., Sinha, Debajyoti, Department of Statistics, Florida State University
 Abstract/Description

This dissertation is on analysis of invariants of a 3D configuration from its 2D images in pictures of this configuration, without requiring any restriction on the camera positioning relative to the scene pictured. We briefly review some of the main results found in the literature. The methodology used is nonparametric and manifold-based, combined with standard computer vision reconstruction techniques. More specifically, we use asymptotic results for the extrinsic sample mean and the extrinsic sample covariance to construct bootstrap confidence regions for mean projective shapes of 3D configurations. Chapters 4, 5 and 6 contain new results. In chapter 4, we develop tests for coplanarity. Chapter 5 is on reconstruction of 3D polyhedral scenes, including texture from arbitrary partial views. In chapter 6, we develop a nonparametric methodology for estimating the mean change for matched samples on a Lie group. We then notice that for k ≥ 4, a manifold of projective shapes of k-ads in general position in 3D has the structure of a (3k − 15)-dimensional Lie group (P-Quaternions) that is equivariantly embedded in a Euclidean space; therefore testing for mean 3D projective shape change amounts to a one-sample test for extrinsic mean P-Quaternion objects. The Lie group technique leads to a large sample and nonparametric bootstrap test for one population extrinsic mean on a projective shape space, as recently developed by Patrangenaru, Liu and Sughatadasa [1]. On the other hand, in absence of occlusions, the 3D projective shape of a spatial configuration can be recovered from a stereo pair of images, thus allowing a test for mean glaucomatous 3D projective shape change from standard stereo pairs of eye images.
Date Issued
 2010
 Identifier
 FSU_migr_etd7118
 Format
 Thesis
 Title
 Bayesian Portfolio Optimization with Time-Varying Factor Models.
 Creator

Zhao, Feng, Niu, Xufeng, Cheng, Yingmei, Huffer, Fred W., Zhang, Jinfeng, Department of Statistics, Florida State University
 Abstract/Description

We develop a modeling framework to simultaneously evaluate various types of predictability in stock returns, including stocks' sensitivity ("betas") to systematic risk factors, stocks' abnormal returns unexplained by risk factors ("alphas"), and returns of risk factors in excess of the risk-free rate ("risk premia"). Both firm-level characteristics and macroeconomic variables are used to predict stocks' time-varying alphas and betas, and macroeconomic variables are used to predict the risk premia. All of the models are specified in a Bayesian framework to account for estimation risk, and informative prior distributions on both stock returns and model parameters are adopted to reduce estimation error. To gauge the economic significance of the predictability, we apply the models to the U.S. stock market and construct optimal portfolios based on model predictions. Out-of-sample performance of the portfolios is evaluated to compare the models. The empirical results confirm predictability from all of the sources considered in our model: (1) The equity risk premium is time-varying and predictable using macroeconomic variables; (2) Stocks' alphas and betas differ cross-sectionally and are predictable using firm-level characteristics; and (3) Stocks' alphas and betas are also time-varying and predictable using macroeconomic variables. Comparison of different subperiods shows that the predictability of stocks' betas is persistent over time, but the predictability of stocks' alphas and the risk premium has diminished to some extent. The empirical results also suggest that Bayesian statistical techniques, especially the use of informative prior distributions, help reduce model estimation error and result in portfolios that outperform the passive indexing strategy. The findings are robust in the presence of transaction costs.
Date Issued
 2011
 Identifier
 FSU_migr_etd0526
 Format
 Thesis
 Title
 Nonparametric Estimation of Three Dimensional Projective Shapes with Applications in Medical Imaging and in Pattern Recognition.
 Creator

Crane, Michael, Patrangenaru, Victor, Liu, Xiuwen, Huffer, Fred W., Sinha, Debajyoti, Department of Statistics, Florida State University
 Abstract/Description

This dissertation is on the analysis of invariants of a 3D configuration from its 2D images in pictures of this configuration, without requiring any restriction on the camera positioning relative to the scene pictured. We briefly review some of the main results found in the literature. The methodology used is nonparametric and manifold-based, combined with standard computer vision reconstruction techniques. More specifically, we use asymptotic results for the extrinsic sample mean and the extrinsic sample covariance to construct bootstrap confidence regions for mean projective shapes of 3D configurations. Chapters 4, 5 and 6 contain new results. In Chapter 4, we develop tests for coplanarity. Chapter 5 is on reconstruction of 3D polyhedral scenes, including texture, from arbitrary partial views. In Chapter 6, we develop a nonparametric methodology for estimating the mean change for matched samples on a Lie group. We then notice that for k ≥ 4, a manifold of projective shapes of k-ads in general position in 3D has the structure of a (3k − 15)-dimensional Lie group (P-Quaternions) that is equivariantly embedded in a Euclidean space; therefore testing for mean 3D projective shape change amounts to a one-sample test for extrinsic mean P-Quaternion objects. The Lie group technique leads to a large-sample and nonparametric bootstrap test for a one-population extrinsic mean on a projective shape space, as recently developed by Patrangenaru, Liu and Sughatadasa. On the other hand, in the absence of occlusions, the 3D projective shape of a spatial configuration can be recovered from a stereo pair of images, thus allowing us to test for mean glaucomatous 3D projective shape change from standard stereo pairs of eye images.
 Date Issued
 2010
 Identifier
 FSU_migr_etd4607
 Format
 Thesis
 Title
 Combining Regression Slopes from Studies with Different Models in Meta-Analysis.
 Creator

Jeon, Sanghyun, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Yang, Yanyun, Paek, Insu, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Primary studies are increasingly using complex models. Slopes from multiple regression analyses are reported in primary studies, but few scholars have dealt with how to combine multiple regression slopes. One of the problems in combining multiple regression slopes is that each study may use a different regression model. The purpose of this research is to propose a method for combining partial regression slopes from studies with different regression models. The method combines comparable covariance matrices to obtain a synthetic partial slope. The proposed method assumes that the population is homogeneous and that the different regression models are nested. Elements in the sample covariance matrix are not independent of each other, so missing elements should be imputed using conditional expectations. The Bartlett decomposition is used to decompose the sample covariance matrix into a parameter component and a sampling error component. The proposed method treats the sample-size weighted average as a parameter matrix and applies Bartlett’s decomposition to the sample covariance matrices to get their respective error matrices. Since missing elements in the error matrix are not correlated, missing elements can be estimated in the error matrices and hence in the parameter matrices. Finally, the partial slopes can be computed from the combined matrices. Simulation shows that the suggested method gives smaller standard errors than the listwise-deletion method and the pairwise-deletion method. An empirical examination shows that the suggested method can be applied to heterogeneous populations.
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Jeon_fsu_0071E_14179
 Format
 Thesis
 Title
 Examining the Effect of Treatment on the Distribution of Blood Pressure in the Population Using Observational Data.
 Creator

Kucukemiroglu, Saryet Alexa, McGee, Daniel, Slate, Elizabeth H., Hurt, Myra M., Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Since the introduction of antihypertensive medications in the mid-1950s, there has been an increased use of blood pressure medications in the US. The growing use of antihypertensive treatment has affected the distribution of blood pressure in the population over time. As a result, observational data no longer reflect natural blood pressure levels. Our goal is to examine the effect of antihypertensive drugs on distributions of blood pressure using several well-known observational studies. The statistical concept of censoring is used to estimate the distribution of blood pressure in populations if no treatment were available. The treated and estimated untreated distributions are then compared to determine the general effect of these medications in the population. Our analyses show that these drugs have an increasing impact on controlling blood pressure distributions in populations that are heavily treated.
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Kucukemiroglu_fsu_0071E_14275
 Format
 Thesis
 Title
 Semi-Parametric Generalized Estimating Equations with Kernel Smoother: A Longitudinal Study in Financial Data Analysis.
 Creator

Yang, Liu, Niu, Xufeng, Cheng, Yingmei, Huffer, Fred W. (Fred William), Tao, Minjing, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Longitudinal studies are widely used in various fields, such as public health, clinical trials and financial data analysis. A major challenge for longitudinal studies is repeated measurements from each subject, which cause time-dependent correlation within subjects. Generalized Estimating Equations can deal with correlated outcomes for longitudinal data through marginal effects. My model is based on Generalized Estimating Equations with a semiparametric approach, providing a flexible structure for regression models: coefficients for parametric covariates are estimated, and nuisance covariates are fitted with kernel smoothers for the nonparametric part. The profile kernel estimator and the seemingly unrelated kernel estimator (SUR) are used to deliver consistent and efficient semiparametric estimators compared to parametric models. We provide simulation results for estimating semiparametric models with one or multiple nonparametric terms. In the application part, we focus on the financial market: a credit card loan data set with the payment information for each customer across 6 months is used to investigate whether gender, income, age or other factors significantly influence payment status. Furthermore, we propose model comparisons to evaluate whether our model should be fitted based on different levels of factors, such as male and female, or based on different types of estimating methods, such as parametric estimation or semiparametric estimation.
 Date Issued
 2017
 Identifier
 FSU_FALL2017_YANG_fsu_0071E_14219
 Format
 Thesis
 Title
 Bayesian Modeling and Variable Selection for Complex Data.
 Creator

Li, Hanning, Pati, Debdeep, Huffer, Fred W. (Fred William), Kercheval, Alec N., Sinha, Debajyoti, Bradley, Jonathan R., Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

As we routinely encounter high-throughput datasets in complex biological and environmental research, developing novel models and methods for variable selection has received widespread attention. In this dissertation, we address a few key challenges in Bayesian modeling and variable selection for high-dimensional data with complex spatial structures. a) Most Bayesian variable selection methods are restricted to mixture priors having separate components for characterizing the signal and the noise. However, such priors encounter computational issues in high dimensions. This has motivated continuous shrinkage priors, which resemble the two-component priors while facilitating computation and interpretability. While such priors are widely used for estimating high-dimensional sparse vectors, selecting a subset of variables remains a daunting task. b) Spatial/spatial-temporal data sets with complex structures are nowadays commonly encountered in various scientific research fields, including atmospheric sciences, forestry, environmental science, biological science, and social science. Selecting important spatial variables that have significant influences on occurrences of events is necessary for providing insights to researchers. Self-excitation, a feature whereby the occurrence of an event increases the likelihood of more occurrences of the same type of event nearby in time and space, can be found in many natural and social events. Research on modeling data with a self-excitation feature has drawn increasing interest recently. However, existing literature on self-exciting models that include high-dimensional spatial covariates is still underdeveloped. c) The Gaussian process is among the most powerful modeling frameworks for spatial data. Its major bottleneck is the computational complexity, which stems from inversion of dense matrices associated with a Gaussian process covariance. 
Hierarchical divide-conquer Gaussian process models have been investigated for ultra large data sets. However, computation associated with scaling the distributed computing algorithm to handle a large number of subgroups poses a serious bottleneck. In Chapter 2 of this dissertation, we propose a general approach for variable selection with shrinkage priors. The presence of very few tuning parameters makes our method attractive in comparison to ad hoc thresholding approaches. The applicability of the approach is not limited to continuous shrinkage priors; it can be used along with any shrinkage prior. Theoretical properties for near-collinear design matrices are investigated, and the method is shown to have good performance in a wide range of synthetic data examples and in a real data example on selecting genes affecting survival due to lymphoma. In Chapter 3 of this dissertation, we propose a new self-exciting model that allows the inclusion of spatial covariates. We develop algorithms which are effective in obtaining accurate estimation and variable selection results in a variety of synthetic data examples. Our proposed model is applied to Chicago crime data, where the influence of various spatial features is investigated. In Chapter 4, we focus on a hierarchical Gaussian process regression model for ultra-high dimensional spatial datasets. By evaluating the latent Gaussian process on a regular grid, we propose an efficient computational algorithm through circulant embedding. The latent Gaussian process borrows information across multiple subgroups, thereby obtaining a more accurate prediction. The hierarchical model and our proposed algorithm are studied through simulation examples.
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Li_fsu_0071E_14159
 Format
 Thesis
 Title
 The Impact of Unbalanced Designs on the Performance of Parametric and Nonparametric DIF Procedures: A Comparison of Mantel Haenszel, Logistic Regression, SIBTEST, and IRTLR Procedures.
 Creator

Alghamdi, Abdullah Ahmed, Paek, Insu, Huffer, Fred W. (Fred William), Becker, Betsy Jane, Yang, Yanyun, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

The current study examined the impact of unbalanced sample sizes between focal and reference groups on the Type I error rates and DIF detection rates (power) of five DIF procedures (MH, LR, general IRTLR, IRTLRb, and SIBTEST). Five simulation factors were used in this study. Four factors were for generating simulation data: sample size, DIF magnitude, group mean ability difference (impact), and the studied item difficulty. The fifth factor was the DIF method factor, which included MH, LR, general IRTLR, IRTLRb, and SIBTEST. A repeated-measures ANOVA, where the DIF method factor was the within-subjects variable, was performed to compare the performance of the five DIF procedures and to discover their interactions with other factors. For each data generation condition, 200 replications were made. Type I error rates for the MH and IRTLR DIF procedures were close to or lower than 5%, the nominal level, for different sample size levels. On average, the Type I error rates for IRTLRb and SIBTEST were 5.7% and 6.4%, respectively. In contrast, the LR DIF procedure seems to have a higher Type I error rate, which ranged from 5.3% to 8.1% with 6.9% on average. As for the rejection rate under DIF conditions, or the DIF detection rate, IRTLRb showed the highest DIF detection rate, followed by SIBTEST, with averages of 71.8% and 68.4%, respectively. Overall, the impact of unbalanced sample sizes between reference and focal groups on the performance of DIF detection showed a similar tendency for all methods, with DIF detection rates generally increasing as the total sample size increased. In practice, IRTLRb, which showed the best performance for DIF detection rates and controlled the Type I error rates, should be the choice when the model-data fit is reasonable. If other non-IRT DIF methods are considered, MH or SIBTEST could be used, depending on which type of error (Type I or II) is considered more serious.
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Alghamdi_fsu_0071E_14180
 Format
 Thesis
 Title
 The Impact of Rater Variability on Relationships among Different Effect-Size Indices for Inter-Rater Agreement between Human and Automated Essay Scoring.
 Creator

Yun, Jiyeo, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Paek, Insu, Zhang, Qian, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Since researchers began investigating automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring and have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used to assess the relatedness of human and automated essay scoring, and to examine impacts of rater variability on inter-rater agreement. To implement the investigations, my study consists of two parts: empirical and simulation studies. Based on the results from the empirical study, the overall effects for inter-rater agreement were .63 and .99 for exact and adjacent proportions of agreement, .48 for kappas, and between .75 and .78 for correlations. Additionally, significant differences existed between 6-point scales and the other scales (i.e., 3-, 4-, and 5-point scales) for correlations, kappas and proportions of agreement. Moreover, based on the results of the simulated data, the highest agreements and lowest discrepancies were achieved in the matched rater distribution pairs. Specifically, the means of exact and adjacent proportions of agreement, kappa and weighted kappa values, and correlations were .58, .95, .42, .78, and .78, respectively. Meanwhile, the average standardized mean difference was .0005 in the matched rater distribution pairs. Acceptable values for inter-rater agreement as evaluation criteria for automated essay scoring, impacts of rater variability on inter-rater agreement, and relationships among inter-rater agreement indices are discussed.
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Yun_fsu_0071E_14144
 Format
 Thesis
 Title
 Spatial Statistics and Its Applications in Biostatistics and Environmental Statistics.
 Creator

Hu, Guanyu, Huffer, Fred W. (Fred William), Paek, Insu, Sinha, Debajyoti, Slate, Elizabeth H., Bradley, Jonathan R., Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

This dissertation presents some topics in spatial statistics and their application in biostatistics and environmental statistics. The field of spatial statistics is an active area in statistics. In Chapter 2 and Chapter 3, the goal is to build subregion models under the assumption that the responses or the parameters are spatially correlated. For regression models, considering spatially varying coefficients is a reasonable way to build subregion models. There are two different techniques for exploring spatially varying coefficients. One is geographically weighted regression (Brunsdon et al. 1998). The other is a spatially varying coefficients model which assumes a stationary Gaussian process for the regression coefficients (Gelfand et al. 2003). Based on the ideas of these two techniques, we introduce techniques for exploring subregion models in survival analysis, which is an important area of biostatistics. In Chapter 2, we introduce modified versions of the Kaplan-Meier and Nelson-Aalen estimators which incorporate geographical weighting. We use ideas from counting process theory to obtain these modified estimators, to derive variance estimates, and to develop associated hypothesis tests. In Chapter 3, we introduce a Bayesian parametric accelerated failure time model with spatially varying coefficients. These two techniques can explore subregion models in survival analysis using both nonparametric and parametric approaches. In Chapter 4, we introduce Bayesian parametric covariance regression analysis for a response vector. The proposed method defines a regression model between the covariance matrix of a p-dimensional response vector and auxiliary variables. We propose a constrained Metropolis-Hastings algorithm to get the estimates. Simulation results are presented to show the performance of both regression and covariance matrix estimates. Furthermore, we present a more realistic simulation experiment in which our Bayesian approach has better performance than the MLE. 
Finally, we illustrate the usefulness of our model by applying it to the Google Flu data. In Chapter 5, we give a brief summary of future work.
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Hu_fsu_0071E_14205
 Format
 Thesis
 Title
 Median Regression for Complex Survey Data.
 Creator

Fraser, Raphael André, Sinha, Debajyoti, Lipsitz, Stuart, Carlson, Elwood, Slate, Elizabeth H., Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

The ready availability of public-use data from various large national complex surveys has immense potential for the assessment of population characteristics (means, proportions, totals, etc.). Using a model-based approach, complex surveys can be used to evaluate the effectiveness of treatments and to identify risk factors for important diseases such as cancer. Existing statistical methods based on estimating equations and/or utilizing resampling methods are often not valid with survey data due to design features such as stratification, multistage sampling and unequal selection probabilities. In this paper, we accommodate these design features in the analysis of highly skewed response variables arising from large complex surveys. Specifically, we propose a double-transform-both-sides based estimating equations approach to estimate the median regression parameters of the highly skewed response; the double-transform-both-sides method applies the same transformation twice to both the response and the regression function. The usual sandwich variance estimate can be used in our approach, whereas a resampling approach would be needed for a pseudo-likelihood based on minimizing absolute deviations. Furthermore, the double-transform-both-sides estimator is relatively robust to the true underlying distribution, and has much smaller mean square error than the least absolute deviations estimator. The method is motivated by an analysis of laboratory data on urinary iodine concentration from the National Health and Nutrition Examination Survey.
 Date Issued
 2015
 Identifier
 FSU_2015fall_Fraser_fsu_0071E_12825
 Format
 Thesis
 Title
 Four Methods for Combining Dependent Effects from Studies Reporting Regression Analysis.
 Creator

Gunter, Tracey Danielle, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Almond, Russell G., Paek, Insu, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Over the years, a variety of indices have been proposed to summarize regression analyses. Unfortunately, the proposed indices are only appropriate when meta-analysts want to understand the role of a single predictor variable in predicting the outcome variable. However, sometimes meta-analysts want to understand the effect of a set of variables on an outcome variable. In this paper, four methods are presented for obtaining a composite effect for two focal predictor variables from a single regression model. The indices are the average of the standardized regression coefficients (ASC), the average of the standardized regression coefficients using Hedges and Olkin's (1985) approach (AHO), the sheaf coefficient (SC), and the squared multiple semipartial correlation coefficient (MSP). A simulation study was conducted to examine the behavior of the indices and their variances when the number of predictor variables in the model, the sample size, the correlations between the focal predictor variables in the model, and the correlations between the focal and non-focal predictor variables in the model were manipulated. The results of the study show that the average bias values of the ASC and AHO estimates are small even when the sample size is small. Furthermore, the ASC and AHO estimates and their estimated variances are more precise than the other indices under all conditions examined. Therefore, when meta-analysts are interested in estimating the effect of a set of predictor variables on an outcome variable from a single regression model, the ASC or AHO procedures are preferred.
 Date Issued
 2015
 Identifier
 FSU_2015fall_Gunter_fsu_0071E_12829
 Format
 Thesis
 Title
 Sorvali Dilatation and Spin Divisors on Riemann and Klein Surfaces.
 Creator

Almalki, Yahya Ahmed, Nolder, Craig, Huffer, Fred W. (Fred William), Klassen, E. (Eric), van Hoeij, Mark, Florida State University, College of Arts and Sciences, Department of Mathematics
 Abstract/Description

We review the Sorvali dilatation of isomorphisms of covering groups of Riemann surfaces and extend the definition to groups containing glide-reflections. Then we give a bound for the distance between two surfaces, one of them resulting from twisting the other at a decomposing curve. Furthermore, we study spin structures on Riemann and Klein surfaces in terms of divisors. In particular, we take a closer look at spin structures on hyperelliptic and p-gonal surfaces defined by divisors supported on branch points. Moreover, we study invariant spin divisors under automorphisms and anti-holomorphic involutions of Riemann surfaces.
 Date Issued
 2017
 Identifier
 FSU_SUMMER2017_ALMALKI_fsu_0071E_14064
 Format
 Thesis
 Title
 Improvement of Quality Prediction in Inter-Connected Manufacturing System by Integrating Multi-Source Data.
 Creator

Ren, Jie, Wang, Hui, Vanli, Omer Arda, Park, Chiwoo, Huffer, Fred W. (Fred William), Florida State University, FAMU-FSU College of Engineering, Department of Industrial and Manufacturing Engineering
 Abstract/Description

With the development of advanced sensing and network technology, such as wireless data transmission and data storage and analytics under cloud platforms, the manufacturing plant is going through a new revolution in which different production units/components can communicate with each other, leading to interconnected manufacturing. The interconnection enables the close coordination of process control actions among machines to improve product quality. Traditional quality prediction methods that focus on the data from one single source are not sufficient to deal with the variation modeling and quality prediction problems involved in interconnected manufacturing. Instead, new quality prediction methods that can integrate the data from multiple sources are necessary. This research addresses the fundamental challenges in improving quality prediction by data fusion for interconnected manufacturing, including knowledge sharing and transfer among different machines and collaboration error monitoring. The methodology is demonstrated through surface machining and additive manufacturing processes. The first study is on the surface quality prediction for one machining process by fusing multiresolution spatial data measured from multiple surfaces or different surface machining processes. The surface variation is decomposed into a global trend part, which characterizes the spatially varying relationship between selected process variables and surface height, and a zero-mean spatial Gaussian process part. Three models, including two varying coefficient-based spatial models and an inference rule-based spatial model, are proposed and compared. Also, a transfer learning technique is used to help train the model by transferring useful information from a data-rich surface to a data-lacking surface, which demonstrates the advantage of interconnected manufacturing. 
The second study deals with the surface mating errors caused by the surface variations from two interconnected surface machining processes. A model aggregating data from two surfaces is proposed to predict the leak areas for surface assembly. By using the measurements of leak areas and the profiles of the mated surfaces as training data along with the Hagen–Poiseuille law, this study develops a novel diagnostic method to predict potential leak areas (leakage paths). The effectiveness and robustness of the proposed method are verified by an experiment and a simulation study. The approach provides practical guidance for the subsequent assembly process as well as troubleshooting in manufacturing processes. The last study focuses on learning a quality prediction model in interconnected additive manufacturing systems, in which the different 3D printing processes involved are driven by similar printing mechanisms and can exchange quality data via a network. A quality prediction model that estimates the printing widths along the printing paths for material-extrusion-based additive manufacturing (a.k.a. fused filament fabrication or fused deposition modeling) is established by leveraging the between-printer quality data. The established mathematical model quantifies the printing line width along the printing paths based on the kinematic parameters, e.g., printing speed and acceleration, while considering data from multiple printers that contain between-machine similarity. The method allows between-printer knowledge sharing to improve the quality prediction, so that a printing process with limited historical data can quickly learn an effective quality model without intensive retraining, thus improving the system's responsiveness to product variety. In the long run, the outcome of this research can contribute to the development of highly efficient Internet-of-Things manufacturing services for personalized products.
 Date Issued
 2019
 Identifier
 2019_Spring_Ren_fsu_0071E_15160
 Format
 Thesis
 Title
 Marked Determinantal Point Processes.
 Creator

Feng, Yiming, Nolder, Craig, Niu, Xufeng, Bradley, Jonathan R., Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Determinantal point processes (DPPs), which can be defined by their correlation kernels with known moments, are useful models for point patterns where nearby points exhibit repulsion. They have many nice properties, such as closed-form densities, tractable estimation of parameterized families, and no edge effects. Univariate DPPs have been well studied in both discrete and continuous settings, although their statistical applications are fairly recent and still rather limited, whereas multivariate DPPs, the so-called multi-type marked DPPs, have been little explored. In this thesis, we propose a class of multivariate DPPs based on a block kernel construction. For the marked DPP, we show that the conditions for existence of the DPP can easily be satisfied. The block construction allows us to model the individually marked DPPs as well as to control the scale of repulsion between points having different marks. Unlike other researchers who model the kernel function of a DPP, we model its spectral representation, which not only guarantees the existence of the multivariate DPP but also makes simulation-based estimation methods readily available. In our research, we adopt a bivariate complex Fourier basis, which has nice properties such as constant intensity and approximate isotropy over short distances between nearby points. The parameterized block kernels can approximate commonly used covariance functions using Fourier expansion. The parameters can be estimated using Maximum Likelihood Estimation, a Bayesian approach, and Minimum Contrast Estimation.
 Date Issued
 2019
 Identifier
 2019_Spring_Feng_fsu_0071E_15011
 Format
 Thesis
 Title
 Bayesian Tractography Using Geometric Shape Priors.
 Creator

Dong, Xiaoming, Srivastava, Anuj, Klassen, E. (Eric), Wu, Wei, Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Diffusion-weighted imaging (DWI) and tractography have been developed for decades and are key elements in recent, large-scale efforts to map the human brain. Together, the two techniques provide a unique possibility to access the macroscopic structure and connectivity of the human brain noninvasively and in vivo. The information obtained not only helps visualize brain connectivity and segment the brain into different functional areas but also provides tools for understanding some major cognitive diseases such as multiple sclerosis, schizophrenia, and epilepsy. Much effort has been put into this area. On the one hand, a vast spectrum of tractography algorithms has been developed in recent years, ranging from deterministic approaches through probabilistic methods to global tractography; on the other hand, various mathematical models, such as the diffusion tensor, the multi-tensor model, spherical deconvolution, and Q-ball modeling, have been developed to better exploit the acquisition-dependent signal of DWI. Despite considerable progress in this area, current methods still face many challenges, such as sensitivity to noise, many false positive and false negative fibers, an inability to handle complex fiber geometry, and expensive computation. More importantly, recent research has shown that, even with high-quality data, the results of current tractography methods may not improve, suggesting that it is unlikely that an anatomically accurate map of the human brain can be obtained solely from the diffusion profile. Motivated by these issues, this dissertation develops a global approach that incorporates anatomically validated geometric shape priors when reconstructing neuron fibers. The fiber tracts between regions of interest are initialized and updated via deformations based on gradients of the posterior energy defined in this paper.
This energy has contributions from the diffusion data, shape prior information, and a roughness penalty. The dissertation first describes and demonstrates the proposed method on a 2D dataset and then extends it to 3D phantom data and real brain data. The results show that the proposed method is relatively immune to issues such as noise, complicated fiber structures like crossing and kissing fibers, and false positive fibers, and that it achieves more explainable tractography results.
 Date Issued
 2019
 Identifier
 2019_Spring_DONG_fsu_0071E_15144
 Format
 Thesis
 Title
 Envelopes, Subspace Learning and Applications.
 Creator

Wang, Wenjing, Zhang, Xin, Tao, Minjing, Li, Wen, Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

The envelope model is a nascent dimension reduction technique. We focus on extending the envelope methodology to broader applications. In the first part of this thesis, we propose a common reducing subspace model that can simultaneously estimate covariance matrices, precision matrices, and their differences across multiple populations. This model leads to substantial dimension reduction and efficient parameter estimation. We explicitly quantify the efficiency gain through an asymptotic analysis. In the second part, we propose a set of new mixture models called CLEMM (Clustering with Envelope Mixture Models) that is based on the widely used Gaussian mixture model assumptions. The proposed CLEMM framework and the associated envelope-EM algorithms provide the foundations for envelope methodology in unsupervised and semi-supervised learning problems. We also illustrate the performance of these models with simulation studies and empirical applications. In the third part of this thesis, we extend envelope discriminant analysis from vector data to tensor data. Another study, on copula-based models for forecasting the realized volatility matrix, is included; this is an important financial application of estimating covariance matrices. We consider multivariate t, Clayton, and bivariate t, Gumbel, and Clayton copulas to model and forecast one-day-ahead realized volatility matrices. Empirical results show that copula-based models can achieve significant performance in terms of both statistical precision and economic efficiency.
 Date Issued
 2019
 Identifier
 2019_Spring_Wang_fsu_0071E_15085
 Format
 Thesis
 Title
 Impact of Violations of Measurement Invariance in Longitudinal Mediation Modeling.
 Creator

Xu, Jie, Yang, Yanyun, Zhang, Qian, Huffer, Fred W. (Fred William), Becker, Betsy J., Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Research has shown that cross-sectional mediation analysis cannot accurately reflect a true longitudinal mediated effect. To investigate longitudinal mediated effects, different longitudinal mediation models have been proposed, and these models focus on different research questions related to longitudinal mediation. When fitting mediation models to longitudinal data, the assumption of longitudinal measurement invariance is usually made. However, the consequences of violating this assumption have not been thoroughly studied in mediation analysis. No studies have examined issues of measurement noninvariance in a latent cross-lagged panel mediation (LCPM) model with three or more measurement occasions. The goal of the current study is to investigate the impact of violations of measurement invariance on longitudinal mediation analysis. The focal model in the study is the LCPM model suggested by Cole and Maxwell (2003). This model can be used to examine mediated effects among the latent predictor, mediator, and outcome variables across time. In addition, it can account for measurement error and allow for the evaluation of longitudinal measurement invariance. Simulation methods were used, and the investigation was performed using population covariance matrices and sample data generated under various conditions. Eight design factors were considered for data generation: sample size, proportion of noninvariant items, position of latent factors with noninvariant items, type of noninvariant parameters, magnitude of noninvariance, pattern of noninvariance, size of the direct effect, and size of the mediated effect. Results from the population investigation were evaluated based on overall model fit and the calculated direct and mediated effects; results from the finite sample analysis were evaluated in terms of convergence and inadmissible solutions, overall model fit, bias and relative bias, coverage rates, and statistical power and type I error rates.
In general, results obtained from the finite sample analysis were consistent with those from the population investigation, with respect to both model fit and parameter estimation. The type I error rate of the mediated effects was inflated under the noninvariant conditions with a small sample size (200); power of the direct and mediated effects was excellent (1.0 or close to 1.0) across all investigated conditions. Type I error rates based on the chi-square test statistic were seriously inflated under the invariant conditions, especially when the sample size was relatively small. Power for detecting model misspecifications due to longitudinal noninvariance was excellent across all investigated conditions. Fit indices (CFI, TLI, RMSEA, and SRMR) were not sensitive in detecting misspecifications caused by violations of measurement invariance in the investigated LCPM model. Study results also showed that as the magnitude of noninvariance, the proportion of noninvariant items, and the number of positions of latent variables with noninvariant items increased, estimation of the direct and mediated effects tended to be less accurate. The decreasing pattern of change in item parameters over measurement occasions resulted in the least accurate estimates of the direct and mediated effects. Parameter estimates were fairly accurate under the decreasing-then-increasing pattern and the mixed pattern of change in item parameters. Findings from this study can help empirical researchers better understand the potential impact of violating measurement invariance on longitudinal mediation analysis using the LCPM model.
 Date Issued
 2019
 Identifier
 2019_Spring_Xu_fsu_0071E_14994
 Format
 Thesis
 Title
 Random Walks over Point Processes and Their Application in Finance.
 Creator

Salehy, Seyyed Navid, Kercheval, Alec N., Ewald, Brian, Fahim, Arash, Ökten, Giray, Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Mathematics
 Abstract/Description

In continuous-time models in finance, it is common to assume that prices follow a geometric Brownian motion. More precisely, it is assumed that the price at time t ≥ 0 is given by Z_t = Z₀ exp(σB_t + mt), where Z₀ is the initial price, B is standard Brownian motion, σ is the volatility, and m is the drift. We discuss how Z can be viewed as the limit of a sequence of discrete price models based on random walks. We note that in the usual random walks, jumps can only happen at deterministic times. We first construct a natural simple model for price by considering a random walk in which jumps can happen at random times following a counting process N. We then develop a sequence of discrete price models using random walks over point processes. The limit process gives the new price model: Z_t = Z₀ exp(σB_{Λ_t} + mΛ_t), where Λ is the compensator for the counting process N. We note that if N is a Poisson process with intensity 1, then this model coincides with the geometric Brownian motion model for the price. But this new model provides more flexibility, as we can choose N to be many other well-known counting processes. This includes not only homogeneous and inhomogeneous Poisson processes, which have deterministic compensators, but also Hawkes processes, which have stochastic compensators. We also discuss and prove many properties of the time-changed process B_Λ. For example, we show that B_Λ is a continuous square integrable martingale. Moreover, we discuss when B_Λ has uncorrelated increments and when it has independent increments. Finally, we investigate how the Black-Scholes pricing formula changes if the price of the risky asset follows this new model when N is an inhomogeneous Poisson process. We show that the usual Black-Scholes formula is obtained when the counting process N is a Poisson process with intensity 1.
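As a rough illustration of the price model in this abstract, the sketch below simulates Z_t = Z₀ exp(σB_{Λ_t} + mΛ_t) on a time grid for the case of a deterministic compensator (for example, an inhomogeneous Poisson process). It is not taken from the dissertation: the function name, its parameters, and the example intensity 1 + t are assumptions made for illustration only.

```python
import math
import random


def simulate_time_changed_price(z0, sigma, m, compensator, times, rng):
    """Simulate Z_t = z0 * exp(sigma * B_{Lambda_t} + m * Lambda_t) on a grid.

    `compensator` is a hypothetical callable t -> Lambda_t; it is assumed to be
    deterministic and nondecreasing with Lambda_0 = 0. `times` must be increasing.
    """
    prices = []
    b = 0.0         # value of the Brownian motion at the time-changed clock
    prev_lam = 0.0  # Lambda at the previous grid point
    for t in times:
        lam = compensator(t)
        # The Brownian increment over [prev_lam, lam] has variance lam - prev_lam.
        b += rng.gauss(0.0, math.sqrt(lam - prev_lam))
        prices.append(z0 * math.exp(sigma * b + m * lam))
        prev_lam = lam
    return prices


rng = random.Random(0)
grid = [0.25 * i for i in range(1, 9)]

# Lambda_t = t (Poisson process with intensity 1): ordinary geometric Brownian motion.
gbm_path = simulate_time_changed_price(100.0, 0.2, 0.05, lambda t: t, grid, rng)

# Assumed intensity 1 + t, so Lambda_t = t + t^2 / 2: the clock runs faster over time.
fast_path = simulate_time_changed_price(
    100.0, 0.2, 0.05, lambda t: t + 0.5 * t * t, grid, rng
)
```

A stochastic compensator, as for a Hawkes process, would require simulating Λ itself along each path, which this sketch does not attempt.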
 Date Issued
 2019
 Identifier
 2019_Spring_Salehy_fsu_0071E_15152
 Format
 Thesis
 Title
 Univariate and Multivariate Volatility Models for Portfolio Value at Risk.
 Creator

Xiao, Jingyi, Niu, Xufeng, Ökten, Giray, Wu, Wei, Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

In modern financial risk management, modeling and forecasting stock return movements via their conditional volatilities, and in particular predicting the Value at Risk (VaR), have become increasingly important for a healthy economic environment. In this dissertation, we evaluate and compare two main families of models for the conditional volatilities, GARCH and Stochastic Volatility (SV), in terms of their VaR prediction performance on 5 major US stock indices. We calculate GARCH-type model parameters via Quasi Maximum Likelihood Estimation (QMLE), while for those of SV we employ MCMC with the Ancillary Sufficient Interweaving Strategy. We use the forecast volatilities corresponding to each model to predict the VaR of the 5 indices. We test the predictive performance of the estimated models by a two-stage backtesting procedure and then compare them via the Lopez loss function. Results of this dissertation indicate that even though it is more computationally demanding than GARCH-type models, SV dominates them in forecasting VaR. Since financial volatilities move together across assets and markets, modeling the volatilities in a multivariate framework appears more appropriate. However, existing studies in the literature do not present compelling evidence for a strong preference between univariate and multivariate models. In this dissertation we also address the problem of forecasting portfolio VaR via multivariate GARCH models versus univariate GARCH models. We construct 3 portfolios with stock returns of 3 major US stock indices, 6 major banks, and 6 major technical companies, respectively. For each portfolio, we model the portfolio conditional covariances with GARCH, EGARCH, MGARCH-BEKK, MGARCH-DCC, and GO-GARCH models. For each estimated model, the forecast portfolio volatilities are further used to calculate the portfolio VaR.
The ability to capture the portfolio volatilities is evaluated by MAE and RMSE; the VaR prediction performance is tested through a two-stage backtesting procedure and compared in terms of the loss function. The results of our study indicate that even though MGARCH models are better at predicting the volatilities of some portfolios, GARCH models can perform as well as their multivariate (and computationally more demanding) counterparts.
 Date Issued
 2019
 Identifier
 2019_Spring_Xiao_fsu_0071E_15172
 Format
 Thesis
 Title
 Parameter Sensitive Feature Selection for Learning on Large Datasets.
 Creator

Gramajo, Gary, Barbu, Adrian G. (Adrian Gheorghe), Piyush, Kumar, Huffer, Fred W. (Fred William), She, Yiyuan, Zhang, Jinfeng, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Though there are many feature selection methods for learning, they might not scale well to very large datasets, such as those generated in computer vision. Furthermore, it can be beneficial to capture and model the variability inherent to data, as in face detection, where a plethora of face poses (i.e., parameters) are possible. We propose a parameter sensitive learning method that can learn effectively on datasets that can be prohibitively large. Our contributions are the following. First, we propose an efficient feature selection algorithm that optimizes a differentiable loss with sparsity constraints. We note that any differentiable loss can be used, and the choice will vary depending on the application. The iterative algorithm alternates parameter updates with tightening the sparsity constraints by gradually removing variables based on the coefficient magnitudes and a schedule. Second, we show how to train a single parameter sensitive classifier that models the wide range of class variability. Using a single classifier is important since it reduces the amount of data necessary for training compared to methods where a separate classifier is trained for each parameter value. Third, we show how to use nonlinear univariate response functions to obtain a nonlinear decision boundary with feature selection, an important characteristic since the separation of classes in real-world datasets is very challenging. Fourth, we show it is possible to mine hard negatives with feature selection, though it is more difficult. This is vital in computer vision data, where 10^5 training examples can be generated per image. Fifth, we propose an approach to perform face detection using a 3D model on a number of face keypoints. We modify binary face features from the literature (generated using random forests) to fit into our 3D model framework.
Experiments on detecting the face keypoints and on face detection using the proposed 3D models and modified face features show that the feature selection dramatically improves performance and comes close to the state of the art on two standard datasets for face detection. We also apply our parameter sensitive learning method with feature selection to detect malicious websites, using a dataset with approximately 2.4 million websites and 3.3 million features per website. We outperform other batch algorithms and obtain results close to those of a high-performing online algorithm while using far fewer features.
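The first contribution above (alternating parameter updates with gradual removal of the smallest coefficients) can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's algorithm: the squared-error loss, the linear removal schedule, the function name `select_features`, and all parameter values are hypothetical choices.

```python
def select_features(X, y, k, n_iter=200, lr=0.1):
    """Return (kept feature indices, weights) for a squared-error loss.

    X is a list of rows, y a list of targets, k the final number of features.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    active = list(range(p))
    for it in range(n_iter):
        # Parameter update: one gradient step on half the mean squared error,
        # using only the features that are still active.
        resid = [sum(w[j] * X[i][j] for j in active) - y[i] for i in range(n)]
        for j in active:
            grad = sum(resid[i] * X[i][j] for i in range(n)) / n
            w[j] -= lr * grad
        # Assumed schedule: shrink the allowed feature count linearly from p
        # to k over the first half of the iterations.
        keep = max(k, p - (p - k) * (it + 1) * 2 // n_iter)
        if len(active) > keep:
            # Tighten the sparsity constraint: keep the largest |w_j| only.
            active = sorted(active, key=lambda j: abs(w[j]), reverse=True)[:keep]
            kept = set(active)
            w = [w[j] if j in kept else 0.0 for j in range(p)]
    return sorted(active), w


# Toy data: the response depends only on the first of three features.
X = [[a, b, c] for a in (-1.0, 1.0) for b in (-1.0, 1.0) for c in (-1.0, 1.0)]
y = [2.0 * row[0] for row in X]
kept, weights = select_features(X, y, k=1)
```

Any differentiable loss could replace the squared error here by changing the gradient computation; the removal step depends only on the coefficient magnitudes.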
 Date Issued
 2015
 Identifier
 FSU_migr_etd9604
 Format
 Thesis
 Title
 Tools for Statistical Analysis on Shape Spaces of Three-Dimensional Object.
 Creator

Xie, Qian, Srivastava, Anuj, Klassen, E. (Eric), Huffer, Fred W. (Fred William), Wu, Wei, Zhang, Jinfeng, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

With the increasing popularity of information technology, especially electronic imaging techniques, large amounts of high-dimensional data such as 3D shapes have become pervasive in science, engineering, and even people's daily lives in recent years. Though the quantity of data is huge, the extraction of relevant knowledge from those data is still limited. How to understand data in a meaningful way is generally an open problem. The specific challenges include finding adequate mathematical representations of data and designing proper algorithms to process them. The existing tools for analyzing high-dimensional data, including 3D shape data, are found to be insufficient, as they usually suffer from many factors, such as misalignments, noise, and clutter. This thesis attempts to develop a framework for processing, analyzing, and understanding high-dimensional data, especially 3D shapes, by proposing a set of statistical tools including theory, algorithms, and optimization applied to practical problems. In particular, the following aspects of shape analysis are considered: 1. A framework adopting the SRNF representation, based on parallel transport of deformations across surfaces in the shape space, leads to statistical analysis of shape data. Three main analyses are conducted under this framework: (1) computing geodesics when either two end surfaces or the starting surface and an initial deformation are given; (2) parallel transporting deformations across surfaces; and (3) sampling random surfaces. 2. Computational efficiency plays an important role in performing statistical shape analysis on large datasets of 3D objects. To speed up the previous method, a framework with a numerical solution is introduced that approximates the inverse mapping and reduces the computational cost by an order of magnitude. 3. The geometrical and morphological information of 3D objects, that is, their shapes, can be analyzed explicitly using boundaries extracted from original image scans.
An alternative idea is to consider variability in shapes directly from their embedding images. A novel framework is proposed to unify three important tasks: registering, comparing, and modeling images. 4. Finally, the spatial deformations learned from registering images are modeled using the GRID-based decomposition. This specific model provides a way to decompose a large deformation into local and fundamental ones so that shape differences between images are easily interpretable. In the last chapter, we summarize the conclusions of this research and discuss potential future directions of statistical shape analysis, from both methodological and application perspectives.
 Date Issued
 2015
 Identifier
 FSU_migr_etd9495
 Format
 Thesis
 Title
 A Framework for Comparing Shape Distributions.
 Creator

Henning, Wade, Srivastava, Anuj, Alamo, Rufina G., Huffer, Fred W. (Fred William), Wu, Wei, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

The problem of comparing shape populations is present in many branches of science, including nano-manufacturing, medical imaging, particle analysis, fisheries, seed science, and computer vision. Researchers in these fields have traditionally characterized the profiles in these sets using combinations of scalar-valued descriptor features, like aspect ratio or roughness, whose distributions are easy to compare using classical statistics. However, there is a desire in this community for a single comprehensive feature that uniquely defines these profiles. The shape of the profile itself is such a feature. Shape features have traditionally been studied as individuals, and comparing distributions underlying sets of shapes is challenging. Since the data come in the form of samples from shape populations, we use kernel methods to estimate underlying shape densities. We then take a metric approach to define a proper distance, termed the Fisher-Rao distance, to quantify differences between any two densities. This distance can be used for clustering, classification, and other types of statistical modeling; however, this dissertation focuses on comparing shape populations as a classical two-sample hypothesis test with populations characterized by respective probability densities on shape space. Since we are interested in the shapes of planar closed curves and the space of such curves is infinite-dimensional, there are some theoretical issues in defining and estimating densities on this space. We therefore use a spherical multidimensional scaling algorithm to project shape distributions to the unit two-sphere, which allows us to use a von Mises-Fisher kernel for density estimation. The estimated densities are then compared using the Fisher-Rao distance, which, in turn, is estimated using Monte Carlo methods. This distance estimate is used as a test statistic for the two-sample hypothesis test mentioned above.
We use a bootstrap approach to perform the test and to evaluate population classification performance. We demonstrate these ideas using applications from industrial and chemical engineering.
 Date Issued
 2014
 Identifier
 FSU_migr_etd9185
 Format
 Thesis
 Title
 Within Study Dependence in Meta-Analysis: Comparison of GLS Method and Multilevel Approaches.
 Creator

Lee, Seungjin, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Paek, Insu, Yang, Yanyun, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Multivariate meta-analysis methods typically assume the dependence of effect sizes. One type of experimental-design study that generates dependent effect sizes is the multiple-endpoint study. While the generalized least squares (GLS) approach requires the sample covariance between outcomes within studies to deal with the dependence of the effect sizes, the univariate three-level approach does not require the sample covariance to analyze such multivariate effect-size data. Considering that primary studies rarely report the sample covariance, if the two approaches produce the same estimates and corresponding standard errors, the univariate three-level model approach could be an alternative to the GLS approach. The main purpose of this dissertation was to compare these two approaches under the random-effects model for synthesizing standardized mean differences in multiple-endpoint experimental designs using a simulation study. Two data sets were generated under the random-effects model: one set with two outcomes and the other set with five outcomes. The simulation study in this dissertation found that the univariate three-level model yielded appropriate parameter estimates and standard errors corresponding to those in the multivariate meta-analysis using the GLS approach.
Show less  Date Issued
 2014
 Identifier
 FSU_migr_etd9205
 Format
 Thesis
 Title
 Estimating Sensitivities of Exotic Options Using Monte Carlo Methods.
 Creator

Yuan, Wei, Ökten, Giray, Kim, Kyounghee, Huffer, Fred W. (Fred William), Kercheval, Alec N., Nichols, Warren, Florida State University, College of Arts and Sciences, Department of Mathematics
 Abstract/Description

In this dissertation, methods of estimating the sensitivity of complex exotic options, including options written on multiple assets and options with discontinuous payoffs, are investigated. The calculation of the sensitivities (Greeks) is based on the finite difference method, pathwise method, likelihood ratio method and kernel method, via Monte Carlo or quasi-Monte Carlo simulation. Direct Monte Carlo estimators for various sensitivities of weather derivatives and mountain range options are given. The numerical results show that the pathwise method outperforms other methods when the payoff function is Lipschitz continuous. The kernel method and the central finite difference methods are competitive when the payoff function is discontinuous.
 Date Issued
 2015
 Identifier
 FSU_migr_etd9528
 Format
 Thesis
 Title
 Meta-Analysis of Factor Analyses: Comparison of Univariate and Multivariate Approaches Using Correlation Matrices and Factor Loadings.
 Creator

Cho, Kyunghwa, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Paek, Insu, Yang, Yanyun, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Currently, more sophisticated techniques such as factor analyses are frequently applied in primary research and thus may need to be meta-analyzed. This topic has been given little attention in the past due to its complexity. Because factor analysis is becoming more popular in research in many areas including education, social work, social science, and so on, the study of methods for the meta-analysis of factor analyses is also becoming more important. The first main purpose of this dissertation is to compare the results of seven different approaches to doing meta-analysis of confirmatory factor analyses. Specifically, five approaches are based on univariate meta-analysis methods. The next two approaches use multivariate meta-analysis to obtain the results of factor loadings and the standard errors of factor loadings. The results from each approach are compared. Given the fact that factor analyses are commonly used in many areas, the second purpose of this dissertation is to explore the appropriate approach or approaches to use for the meta-analysis of factor analyses, especially Confirmatory Factor Analysis (CFA). When the average sample size was small, the results of IRD, WMC, WMFL, and GLSMFL approaches showed better performance than those of UMC, MFL, and GLSMC approaches in estimating parameters. With large average sample sizes (larger than 150), the performance in estimating the parameters across all seven approaches seemed to be similar in this dissertation. Based on my simulation results, researchers who want to conduct meta-analytic confirmatory factor analysis can apply any of these approaches to synthesize the results from primary studies if their studies have n > 150.
 Date Issued
 2015
 Identifier
 FSU_migr_etd9570
 Format
 Thesis
 Title
 Exponential Convergence Fourier Method and Its Application to Option Pricing with Lévy Processes.
 Creator

Gu, Fangxi, Nolder, Craig, Huffer, Fred W. (Fred William), Kercheval, Alec N., Nichols, Warren D., Ökten, Giray, Florida State University, College of Arts and Sciences, Department of Mathematics
 Abstract/Description

Option pricing by the Fourier method has been popular for the past decade, and it has been applied to Lévy processes, especially for European options. This thesis focuses on the exponential convergence Fourier method and its application to discrete monitoring options and Bermudan options. An alternative payoff truncating method is derived for comparison with the benchmark Hilbert transform method. A general error control framework is derived to keep the Fourier method out of an overflow problem. Numerical results verify that the alternative payoff truncating sinc method performs better than the benchmark Hilbert transform method under the error control framework.
 Date Issued
 2016
 Identifier
 FSU_FA2016_Gu_fsu_0071E_13579
 Format
 Thesis
 Title
 Sparse Feature and Element Selection in High-Dimensional Vector Autoregressive Models.
 Creator

Huang, Xue, Niu, Xufeng, She, Yiyuan, Cheng, Yingmei, Huffer, Fred W. (Fred William), Wu, Wei, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

This thesis aims to identify the underlying structures of multivariate time series and to propose a methodology to construct predictive VAR models. Due to the complexity of high dimensions in multivariate time series, forecasting a target series with many predictors in VAR models poses a challenge in statistical learning and modeling. The quadratically increasing dimension of the parameter space, which is known as the "curse of dimensionality," poses considerable challenges to multivariate time series models. Meanwhile, there are two facts involved in reducing dimensions in multivariate time series: first, some nuisance time series exist and are better removed; second, a target time series is typically driven by a few dependent elements constructed from some indices. To address these challenges and facts, our approach is to reduce both the dimensions of the series and the features involved in each series simultaneously. As a result, the original high dimensional structure can be modeled using a lower dimensional time series, and subsequently the forecasting performance will be improved. The methodology we introduce in this work is called Sparse Feature and Element Selection (SFES). It employs an "L1 + group L1" penalty to conduct group selection and variable selection within each group simultaneously. Our contributions in this thesis are twofold. First, the doubly-constrained regularization in SFES is a convex mathematical problem, and we optimize it using a fast but simple-to-implement algorithm. We evaluate this algorithm with a large-scale dataset and theoretically prove that it has guaranteed strict iterative convergence and global optimality. Second, we theoretically present nonasymptotic results based on combined statistical and computational analysis. A sharp oracle inequality is proved to reveal its power in predictive learning.
We compare SFES with the related work of Sparse Group Lasso (SGL) to show that the proposed method is both computationally efficient and theoretically justified. Experiments using simulation data and real-world macroeconomic time series data are conducted to demonstrate the efficiency and efficacy of the proposed SFES in practice.
 Date Issued
 2016
 Identifier
 FSU_FA2016_Huang_fsu_0071E_13659
 Format
 Thesis
 Title
 Evaluation of Measurement Invariance in IRT Using Limited Information Fit Statistics/Indices: A Monte Carlo Study.
 Creator

Cui, Mengyao, Yang, Yanyun, Paek, Insu, Huffer, Fred W. (Fred William), Becker, Betsy Jane, Binici, Salih, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Measurement invariance analysis is important when test scores are used to make a groupwise comparison. Multiple-group IRT modeling is one of the commonly used methods for measurement invariance examination. One essential step in the multiple-group modeling method is the evaluation of overall model-data fit. A family of limited information fit statistics has been recently developed for assessing the overall model-data fit in IRT. Previous studies evaluated the performance of limited information fit statistics using single-group data, and found that these fit statistics performed better than the traditional full information fit statistics when data were sparse. However, no study has investigated the performance of the limited information fit statistics within the multiple-group modeling framework. This study aims to examine the performance of the limited information fit statistic (M₂) and the corresponding M₂-based descriptive fit indices in conducting measurement invariance analysis within the multiple-group IRT framework. A Monte Carlo study was conducted to examine sampling distributions of M₂ and M₂-based descriptive fit indices, and their sensitivities to lack of measurement invariance under various conditions. The manipulated factors included sample sizes, model types, dimensionality, types and numbers of DIF items, and latent trait distributions. Results showed that M₂ approximately followed a chi-square distribution when the model was correctly specified, as expected. The type I error rates of M₂ were reasonable under large sample sizes (1000/2000). When the model was misspecified, the power of M₂ was a function of sample size and the number of DIF items. For example, the power of M₂ for rejecting the U2PL Scalar Model increased from 29.2% to 99.9% when the number of uniform DIF items increased from one to six, given the sample sizes of 1000/2000.
With six uniform DIF items (30% of the studied items), the power of M₂ increased from 42.4% to 99.9% when sample sizes changed from 250/500 to 1000/2000. When the difference in M₂ (ΔM₂) was used to compare two correctly specified nested models, the sampling distribution of ΔM₂ departed from the reference chi-square distribution at both tails, especially under small sample sizes. The type I error rates of the ΔM₂ test became closer to the expectation when sample sizes increased. For example, both Metric and Configural Models were correctly specified when the test included no DIF item. Given the alpha level of .05, the type I error rates of ΔM₂ for the comparison between the Metric and Configural Model were slightly inflated with n=250/500 (8.72%), and became closer to the alpha level with n=1000/2000 (5.3%). When at least one of the models was misspecified, the power of ΔM₂ increased when the number of DIF items or sample sizes became larger. For example, the Metric Model was misspecified when a nonuniform DIF item existed. Given sample sizes of 1000/2000 and alpha level of .05, the power of ΔM₂ for the comparison between the Metric and Configural Model increased from 52.55% to 99.39% when the number of nonuniform DIF items changed from one to six. With one nonuniform DIF item in the test, the power of ΔM₂ was only 17.05% given the alpha level of .05 and sample sizes of 250/500, but increased to 52.55% given the sample sizes of 1000/2000. The descriptive fit indices and their differences between nested models were also affected by the number of DIF items. When there was no DIF item, all fit indices indicated good model-data fit. The differences of the five fit indices between nested models were all very small (<.008) across different sample sizes. When DIF items existed, the means of descriptive fit indices, and their differences between nested models, increased when the number of DIF items increased.
The findings from this study provide some suggestions about the implementation of the limited information fit statistics/indices in measurement invariance analysis within the multiple-group IRT framework.
 Date Issued
 2016
 Identifier
 FSU_FA2016_Cui_fsu_0071E_13537
 Format
 Thesis
 Title
 Investigating the Chi-Square-Based Model-Fit Indexes for WLSMV and ULSMV Estimators.
 Creator

Xia, Yan, Yang, Yanyun, Huffer, Fred W. (Fred William), Almond, Russell G., Becker, Betsy Jane, Paek, Insu, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

In structural equation modeling (SEM), researchers use the model chi-square statistic and model-fit indexes to evaluate model-data fit. Root mean square error of approximation (RMSEA), comparative fit index (CFI), and Tucker-Lewis index (TLI) are widely applied model-fit indexes. When data are ordered and categorical, the most popular estimator is the diagonally weighted least squares (DWLS) estimator. Robust corrections have been proposed to adjust the uncorrected chi-square statistic from DWLS so that its first and second order moments are in alignment with the target central chi-square distribution under correctly specified models. DWLS with such a correction is called the mean- and variance-adjusted weighted least squares (WLSMV) estimator. An alternative to WLSMV is the mean- and variance-adjusted unweighted least squares (ULSMV) estimator, which has been shown to perform as well as, or slightly better than, WLSMV. Because the chi-square statistic is corrected, the chi-square-based RMSEA, CFI, and TLI are thus also corrected by replacing the uncorrected chi-square statistic with the robust chi-square statistic. The robust model fit indexes calculated in such a way are named the population-corrected robust (PR) model fit indexes following Brosseau-Liard, Savalei, and Li (2012). The PR model fit indexes are currently reported in almost every application when WLSMV or ULSMV is used. Nevertheless, previous studies have found the PR model fit indexes from WLSMV are sensitive to several factors such as sample sizes, model sizes, and thresholds for categorization. The first focus of this dissertation is on the dependency of model fit indexes on the thresholds for ordered categorical data. Because the weight matrix in the WLSMV fit function and the correction factors for both WLSMV and ULSMV include the asymptotic variances of thresholds and polychoric correlations, the model fit indexes are very likely to depend on the thresholds.
The dependency of model fit indexes on the thresholds is not a desirable property, because when the misspecification lies in the factor structures (e.g., cross-loadings are ignored or two factors are considered as a single factor), model fit indexes should reflect such misspecification rather than the threshold values. As alternatives to the PR model fit indexes, Brosseau-Liard et al. (2012), Brosseau-Liard and Savalei (2014), and Li and Bentler (2006) proposed the sample-corrected robust (SR) model fit indexes. The PR fit indexes are found to converge to distorted asymptotic values, but the SR fit indexes converge to their definitions asymptotically. However, the SR model fit indexes were proposed for continuous data, and have been neither investigated nor implemented in SEM software when WLSMV and ULSMV are applied. This dissertation thus investigates the PR and SR model fit indexes for WLSMV and ULSMV. The first part of the simulation study examines the dependency of the model fit indexes on the thresholds when the model misspecification results from omitting cross-loadings or collapsing factors in confirmatory factor analysis. The study is conducted on extremely large computer-generated datasets in order to approximate the asymptotic values of model fit indexes. The results show that only the SR fit indexes from ULSMV are independent of the population threshold values, given the other design factors. The PR fit indexes from ULSMV, and the PR and SR fit indexes from WLSMV, are influenced by thresholds, especially when data are binary and the hypothesized model is greatly misspecified. The second part of the simulation varies the sample sizes from 100 to 1000 to investigate whether the SR fit indexes under finite samples are more accurate estimates of the defined values of RMSEA, CFI, and TLI, compared with the uncorrected model fit indexes without robust correction and the PR fit indexes. Results show that the SR fit indexes are more accurate in general.
However, when the thresholds are different across items, data are binary, and sample size is less than 500, all versions of these indexes can be very inaccurate. In such situations, larger sample sizes are needed. In addition, the conventional cutoffs developed from continuous data with maximum likelihood (e.g., RMSEA < .06, CFI > .95, and TLI > .95; Hu & Bentler, 1999) have been applied to WLSMV and ULSMV regardless of the arguments against such a practice (e.g., Marsh, Hau, & Wen, 2004). For comparison purposes, this dissertation reports the RMSEA, CFI, and TLI based on continuous data using maximum likelihood before the variables are categorized to create ordered categorical data. Results show that the model fit indexes from maximum likelihood are very different from those from WLSMV and ULSMV, suggesting that the conventional rules should not be applied to WLSMV and ULSMV.
 Date Issued
 2016
 Identifier
 FSU_2016SU_Xia_fsu_0071E_13379
 Format
 Thesis
 Title
 Loglinear Model as a DIF Detection Method for Dichotomous and Polytomous Items and Its Comparison with Other Observed Score Matching DIF Methods.
 Creator

Yesiltas, Gonca, Paek, Insu, Huffer, Fred W. (Fred William), Becker, Betsy Jane, Almond, Russell G., Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

DIF detection methods identify the difference between the performances of subgroups when the subgroups are matched by examinees' ability level or a proxy variable, such as total test score (Holland & Wainer, 1993). The Loglinear Models (LLM) method is one of the DIF detection methods. This method was first introduced by Mellenbergh (1982) to investigate the relationship among item responses, subgroups, and categorized total test score in terms of DIF detection. This study examined the performance of LLM as a DIF detection method for dichotomous items and polytomous items. The LLM method was compared with the Mantel-Haenszel (MH) and logistic regression (LR) methods to detect uniform DIF and with LR to detect nonuniform DIF in dichotomous item response data. MH was not included in nonuniform DIF detection because previous studies indicated that it is not able to detect nonuniform DIF (Narayanan & Swaminathan, 1996; Uttaro & Millsap, 1994). In addition, LLM was compared with the Mantel, generalized Mantel-Haenszel (GMH), ordinal logistic regression (OLR), and logistic discriminant function analysis (LDFA) methods in polytomous item response data. For this purpose, both a simulation study and an empirical study were conducted under various sample sizes, ability mean differences (impact) and item parameters. Since the previous studies did not investigate the effect of ability mean differences on DIF detection with LLM, this study also focused on the effect of ability mean differences between subgroups. This study found that MH was better at detecting uniform DIF, while LR and LLM performed equally well on uniform and nonuniform DIF detection. In addition, GMH and LLM performed better than Mantel, OLR, and LDFA for the polytomous item response data.
 Date Issued
 2016
 Identifier
 FSU_2016SP_Yesiltas_fsu_0071E_13119
 Format
 Thesis
 Title
 The Impact of Competition on Elephant Musth Strategies: A Game-theoretic Model.
 Creator

Wyse, J. Maxwell (John Maxwell), Mesterton-Gibbons, Mike, Huffer, Fred W. (Fred William), Hurdal, Monica K., Cogan, Nicholas G., Florida State University, College of Arts and Sciences, Department of Mathematics
 Abstract/Description

Mature male African elephants are known to periodically enter a temporary state of heightened aggression called "musth," often linked with increased androgens, particularly testosterone. Sexually mature males are capable of entering musth at any time of year, and will often travel long distances to find estrous females. When two musth bulls or two non-musth bulls encounter one another, the agonistic interaction is usually won by the larger male. When a smaller musth bull encounters a larger non-musth bull, however, the smaller musth male can win. The relative mating success of musth males is due partly to this fighting advantage, and partly to estrous females' general preference for musth males. Though musth behavior has long been observed and documented, the evolutionary advantages of musth remain poorly understood. Here we develop a game-theoretic model of male musth behavior which assumes musth duration as a parameter, and distributions of small, medium and large musth males are predicted in both time and space. The predicted results are similar to the observed timing strategies in the Amboseli National Park elephant population. We discuss small male musth behavior, musth-estrus coincidence, the effects of estrous female spatial heterogeneity on musth timing, conservation applications, the assumptions underpinning the model and possible modifications to the model for the purpose of determining musth duration.
 Date Issued
 2017
 Identifier
 FSU_2017SP_Wyse_fsu_0071E_13713
 Format
 Thesis
 Title
 Random Sobol' Sensitivity Analysis and Model Robustness.
 Creator

Mandel, David, Ökten, Giray, Hussaini, M. Yousuff, Huffer, Fred W. (Fred William), Kercheval, Alec N., Fahim, Arash, Florida State University, College of Arts and Sciences, Department of Mathematics
 Abstract/Description

This work develops both the theoretical foundation and the practical application of random Sobol' analysis with two goals. The first is to provide a more general and accommodating approach to global sensitivity analysis, in which the parameter distributions themselves contain uncertainty, and hence the sensitivity results are random quantities as well. The framework for this approach is motivated by empirical evidence of such behavior, and examples of this behavior in interest rate and temperature modeling are provided. The second goal is to compare competing models on their robustness, a notion developed and defined to provide a quantitative solution to model selection based on model uncertainty and sensitivity.
 Date Issued
 2017
 Identifier
 FSU_2017SP_Mandel_fsu_0071E_13682
 Format
 Thesis
 Title
 Time-Varying Mixture Models for Financial Risk Management.
 Creator

Zhang, Shuguang, Niu, Xufeng, Cheng, Yingmei, Huffer, Fred W. (Fred William), Tao, Minjing, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Motivated by understanding the devastating financial crisis in 2008 that was partially caused by underestimation of financial risk, we propose a class of time-varying mixture models for risk analysis and management. There are various metrics for financial risk including value at risk (VaR), expected shortfall, expected / unexpected loss, etc. In this study we focus on VaR. One commonly used method to estimate VaR is the Variance-Covariance method, in which a normal distribution is usually assumed for asset returns, which may underestimate the real risk. To address this issue, in this study we propose a series of two-component mixture models: one component is a normal distribution and the other is a fat-tailed distribution such as the Cauchy distribution, Student's t-distribution or Gumbel distribution. Instead of assuming distribution parameters and weights to be constant, we allow them to change over time, which guarantees flexibility of our models. The Monte Carlo Expectation-Maximization method and Monte Carlo maximum likelihood estimation were used for parameter estimation. Simulation studies are conducted and the models are applied to stock market price data.
 Date Issued
 2016
 Identifier
 FSU_2016SP_Zhang_fsu_0071E_13150
 Format
 Thesis
 Title
 Elastic Functional Principal Component Analysis for Modeling and Testing of Functional Data.
 Creator

Duncan, Megan, Srivastava, Anuj, Klassen, E., Huffer, Fred W., Wu, Wei, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Statistical analysis of functional data requires tools for comparing, summarizing, and modeling observed functions as elements of a function space. A key issue in Functional Data Analysis (FDA) is the presence of phase variability in the observed data. A successful statistical model of functional data has to account for this phase variability; otherwise the ensuing inferences can be inferior. Recent methods for FDA include steps for phase separation or functional alignment. For example, Elastic Functional Principal Component Analysis (Elastic FPCA) uses the strengths of Functional Principal Component Analysis (FPCA), along with the tools from Elastic FDA, to perform joint phase-amplitude separation and modeling. A related problem in FDA is to quantify and test for the amount of phase variability in a given data set. We develop two types of hypothesis tests for the significance of phase variability: a metric-based approach and a model-based approach. The metric-based approach treats phase and amplitude as independent components and uses their respective metrics to apply the Friedman-Rafsky Test, Schilling's Nearest Neighbors, and the Energy Test to test the differences between functions and their amplitudes. In the model-based test, we use Concordance Correlation Coefficients as a tool to quantify the agreement between functions and their reconstructions using FPCA and Elastic FPCA. We demonstrate this framework using a number of simulated and real data sets, including weather, tecator, and growth data.
 Date Issued
 2018
 Identifier
 2018_Sp_Duncan_fsu_0071E_14470
 Format
 Thesis
 Title
 Elastic Functional Regression Model.
 Creator

Ahn, Kyungmin, Srivastava, Anuj, Klassen, E., Wu, Wei, Huffer, Fred W., Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Functional variables serve important roles as predictors in a variety of pattern recognition and vision applications. Focusing on a specific subproblem, termed scalar-on-function regression, most current approaches adopt the standard L2 inner product to form a link between functional predictors and scalar responses. These methods may perform poorly when predictor functions contain nuisance phase variability, i.e., when predictors are temporally misaligned due to noise. While a simple solution could be to pre-align predictors as a preprocessing step before applying a regression model, this alignment is seldom optimal from the perspective of regression. In this dissertation, we propose a new approach, termed elastic functional regression, where alignment is included in the regression model itself and is performed in conjunction with the estimation of other model parameters. This model is based on a norm-preserving warping of predictors, not the standard time warping of functions, and provides better prediction in situations where the shape or the amplitude of the predictor is more useful than its phase. We demonstrate the effectiveness of this framework using simulated and real data.
 Date Issued
 2018
 Identifier
 2018_Sp_Ahn_fsu_0071E_14452
 Format
 Thesis
 Title
 The Comparison of Standard Error Methods in the Marginal Maximum Likelihood Estimation of the Two-Parameter Logistic Item Response Model When the Distribution of the Latent Trait Is Nonnormal.
 Creator

Lin, Zhongtian, Paek, Insu, Huffer, Fred W., Becker, Betsy Jane, Yang, Yanyun, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

A Monte Carlo simulation study was conducted to investigate the accuracy of several item parameter standard error (SE) estimation methods in item response theory (IRT) when the marginal maximum likelihood (MML) estimation method was used and the distribution of the underlying latent trait was nonnormal in the two-parameter logistic (2PL) model. The manipulated between-subject factors were sample size (N), test length (TL), and the shape of the latent trait distribution (Shape). The within-subject factor was the SE estimation method, which included the expected Fisher information method (FIS), the empirical cross-product method (XPD), the supplemented-EM method (SEM), the forward difference method (FDM), the Richardson extrapolation method (REM), and the sandwich-type covariance method (SW). The commercial IRT software flexMIRT was used for item parameter estimation and SE estimation. Results showed that when other factors were held equal, all of the SE methods studied were apt to produce less accurate SE estimates when the distribution of the underlying trait was positively skewed or positively skewed-bimodal than when the distribution was normal. The degree of inaccuracy of each method for an individual item parameter depended on the magnitudes of the relevant a and b parameters, and was affected more by the magnitude of the b parameter. At the test level, the overall average performance of the SE methods interacted with N, TL, and Shape. The FIS was not viable when TL=40 and was only run when TL=15. For such a short test, it remained the “gold standard,” as it estimated the SEs most accurately among all the methods, although it required a relatively longer time to run. The XPD method was the least time-consuming option and generally performed very well when Shape was normal. However, it tended to produce positively biased results when a short test was paired with a small sample. The SW did not outperform other SE methods when Shape was nonnormal as the theory suggests. The FDM had somewhat larger variations when TL=1500 and TL=3000. The SEM and REM were the most accurate among the SE methods in this study and appeared to be a good choice for both normal and nonnormal cases. For each simulated condition, the average shape of the raw-score distribution was presented to help practitioners better infer the shape of the underlying distribution of the latent trait when the true shape is unknown, thereby leading to more informed decisions about SE methods using the results of this study. Implications, limitations, and future directions were discussed.
 Date Issued
 2018
 Identifier
 2018_Sp_Lin_fsu_0071E_14423
 Format
 Thesis
 Title
 Volatility Matrix Estimation for High-Frequency Financial Data.
 Creator

Xue, Yang, Tao, Minjing, Cheng, Yingmei, Fendler, Rachel Loveitt, Huffer, Fred W., Niu, Xufeng, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Volatility is usually employed to measure the dispersion of asset returns, and it is widely used in risk analysis and asset management. The first chapter studies a kernel-based spot volatility matrix estimator with a pre-averaging approach for high-frequency data contaminated by market microstructure noise. When the sample size goes to infinity and the bandwidth vanishes, we show that our estimator is consistent and establish its asymptotic normality with an optimal convergence rate. We also construct a consistent pairwise spot co-volatility estimator with the Hayashi-Yoshida method for nonsynchronous high-frequency data with noise contamination. The simulation studies demonstrate that the proposed estimators work well under different noise levels, and that their estimation performance improves as the sampling frequency increases. In empirical applications, we implement the estimators on the intraday prices of four component stocks of the Dow Jones Industrial Average. The second chapter presents a factor-based vast volatility matrix estimation method for high-frequency financial data with market microstructure noise, finite large jumps, and infinite-activity small jumps. We construct the sample volatility matrix estimator based on the approximate factor model, and use the pre-averaging and thresholding estimation method (PATH) to handle the noise and jumps. After using principal component analysis (PCA) to decompose the sample volatility matrix estimator, we obtain our proposed volatility matrix estimator by imposing block-diagonal regularization on the residual covariance matrix, sorting the assets by their Global Industry Classification Standard (GICS) codes. The Monte Carlo simulation shows that our proposed volatility matrix estimator can remove most of the effects of noise and jumps, and that its estimation performance improves quickly as the sampling frequency increases. Finally, the PCA-based estimators are employed to perform volatility matrix estimation and asset allocation for S&P 500 stocks. For comparison with the PCA-based estimators, we also include exchange-traded fund (ETF) data to construct observable factors, such as the Fama-French factors, for volatility estimation.
 Date Issued
 2018
 Identifier
 2018_Sp_Xue_fsu_0071E_14471
 Format
 Thesis
 Title
 Statistical Shape Analysis of Neuronal Tree Structures.
 Creator

Duncan, Adam, Srivastava, Anuj, Klassen, E., Wu, Wei, Huffer, Fred W., Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Neuron morphology plays a central role in characterizing cognitive health and functionality of brain structures. The problem of quantifying neuron shapes, and capturing statistical variability of shapes, is difficult because axons and dendrites have tree structures that differ in both geometry and topology. In this work, we restrict attention to trees that consist of: (1) a main branch viewed as a parameterized curve in ℝ³, and (2) some number of secondary branches, also parameterized curves in ℝ³, which emanate from the main branch at arbitrary points. We present two shape-analytic frameworks, each of which gives a metric structure to the set of such tree shapes. Both frameworks are based on an elastic metric on the space of curves with certain shape-preserving nuisance variables modded out. In the first framework, the side branches are treated as a continuum of curve-valued annotations to the main branch. In the second framework, the side branches are treated as discrete entities and are matched to each other by permutation. We show geodesic deformations between tree shapes in both frameworks, and we show Fréchet means and modes of variability, as well as cross-validated classification between different experimental groups, using the second framework. We conclude with a smaller project which extends some of these ideas to more general weighted attributed graphs.
 Date Issued
 2018
 Identifier
 2018_Sp_Duncan_fsu_0071E_14500
 Format
 Thesis
 Title
 Critical Issues in Survey Meta-Analysis.
 Creator

Gozutok, Ahmet Serhat, Becker, Betsy Jane, Huffer, Fred W., Yang, Yanyun, Paek, Insu, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

In research synthesis, researchers may aim at summarizing people's attitudes and perceptions of phenomena that have been assessed using different measures. Self-report rating scales are among the most commonly used measurement tools to quantify such latent constructs in education and psychology. However, self-report rating-scale questions measuring the same construct may differ from each other in many ways. Scale format, number of response options, wording of questions, and labeling of response option categories may vary across questions. Consequently, variations across the measures of the same construct raise the issue of comparability of results across studies in meta-analytic investigations. In this study, I examine the complexities of summarizing the results of different survey questions about the same construct in a meta-analytic fashion. More specifically, this study focuses on the practical problems that arise when combining survey items that differ from one another in the wording of question stems, numbers of response option categories, scale direction (i.e., unipolar and bipolar scales), response scale labeling (i.e., fully labeled scales and endpoints-labeled scales), and response-option labeling (e.g., "extremely happy" - "completely happy" - "most happy", "pretty happy", "quite happy", "moderately happy", and "not at all happy" - "least happy" - "most unhappy"). In addition, I propose practical solutions to handle the issues that arise due to such variations when conducting a meta-analysis. I discuss the implications of the proposed solutions from the perspective of meta-analysis. Examples are obtained from the collection of studies in the World Happiness Database (Veenhoven, 2006), which includes various single-item happiness measures.
 Date Issued
 2018
 Identifier
 2018_Fall_Gozutok_fsu_0071E_14866
 Format
 Thesis
 Title
 A Stock Market Agent-Based Model Using Evolutionary Game Theory and Quantum Mechanical Formalism.
 Creator

Montin, Benoit S., Nolder, Craig A., Huffer, Fred W., Case, Bettye Anne, Beaumont, Paul M., Kercheval, Alec N., Sumners, DeWitt L., Department of Mathematics, Florida State University
 Abstract/Description

The financial market is modelled as a complex self-organizing system. Three economic agents interact in a simplified economy and seek to maximize their wealth. Replicator dynamics are used as a myopic behavioral rule to describe how agents learn and benefit from their experiences. Stock price fluctuations result from interactions between economic agents, budget constraints, and conservation laws. Time is discrete. Invariant distributions over the state space, that is to say probability measures that remain unchanged by the one-period transition rule, form stochastic equilibria for our composite system. When agents make mistakes, there is a unique stochastic steady state which reflects the average and limit behavior. Convergence of the iterates occurs at a geometric rate in the total variation norm. Interestingly, when the probability of making a mistake tends to zero, the invariant distribution converges weakly to a stochastic equilibrium for the model without mistakes. Most agent-based computational economies rely heavily on simulations. Having adopted a simple representation of financial markets, we have been able to prove the above theoretical results and gain intuition on complexity economics. The impact of simple monetary policies, such as a decrease of the risk-free rate of interest, on the limit stock price distribution has been analyzed. Of interest as well, the limit stock log return distribution presents real-world features (skewness and leptokurtosis) that more traditional models usually fail to explain or consider. Our artificial market is incomplete. The bid and ask prices of a vanilla call option have been computed to illustrate option pricing in our setting.
 Date Issued
 2004
 Identifier
 FSU_migr_etd2331
 Format
 Thesis