Current Search: Huffer, Fred W. (Fred William)
Search results
 Title
 Predictive Accuracy Measures for Binary Outcomes: Impact of Incidence Rate and Optimization Techniques.
 Creator

Scolnik, Ryan, McGee, Daniel, Slate, Elizabeth H., Eberstein, Isaac W., Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Evaluating the performance of models predicting a binary outcome can be done using a variety of measures. While some measures intend to describe the model's overall fit, others more accurately describe the model's ability to discriminate between the two outcomes. If a model fits well but doesn't discriminate well, what does that tell us? Given two models, if one discriminates well but has poor fit while the other fits well but discriminates poorly, which of the two should we choose? The measures of interest for our research include the area under the ROC curve, Brier score, discrimination slope, LogLoss, R-squared, and F-score. To examine the underlying relationships among all of the measures, real data and simulation studies are used. The real data come from multiple cardiovascular research studies, and the simulation studies are run under general conditions and also for incidence rates ranging from 2% to 50%. The results of these analyses provide insight into the relationships among the measures and raise concern for scenarios where the measures may yield different conclusions. The impact of incidence rate on the relationships provides a basis for exploring alternative maximization routines to logistic regression. While most of the measures are easily optimized using the Newton-Raphson algorithm, maximization of the area under the ROC curve requires optimization of a nonlinear, non-differentiable function. Use of the Nelder-Mead simplex algorithm and close connections to economics research yield unique parameter estimates and general asymptotic conditions. Using real and simulated data to compare optimizing the area under the ROC curve to logistic regression further reveals the impact of incidence rate on the relationships, significant increases in achievable areas under the ROC curve, and differences in conclusions about including a variable in a model.
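The contrast between Newton-Raphson-friendly objectives and the non-differentiable AUC can be made concrete. The sketch below (simulated data; the variable names and the unit-norm constraint are illustrative choices, not the dissertation's) maximizes the empirical AUC of a linear score with the derivative-free Nelder-Mead simplex:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 300
X = rng.normal(size=(n, 2))
# Simulated binary outcome; the intercept controls the incidence rate
p = 1.0 / (1.0 + np.exp(-(-2.0 + 1.5 * X[:, 0] + 0.5 * X[:, 1])))
y = rng.binomial(1, p)

def auc(score):
    """Empirical AUC: proportion of case/control pairs ranked correctly."""
    diff = score[y == 1][:, None] - score[y == 0][None, :]
    return np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

def neg_auc(b):
    b = b / (np.linalg.norm(b) + 1e-12)   # AUC is scale-invariant, so fix ||b|| = 1
    return -auc(X @ b)

# The objective is non-differentiable, so a derivative-free simplex search is used
res = minimize(neg_auc, x0=np.array([1.0, 0.0]), method="Nelder-Mead")
b_hat = res.x / np.linalg.norm(res.x)
print("direction:", b_hat, "achieved AUC:", -res.fun)
```

Because the AUC depends on the coefficients only through the ranking of scores, only the direction of `b_hat` is identified, which is why the scale is pinned to the unit sphere.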
 Date Issued
 2016
 Identifier
 FSU_2016SP_Scolnik_fsu_0071E_13146
 Format
 Thesis
 Title
 Combining Regression Slopes from Studies with Different Models in Meta-Analysis.
 Creator

Jeon, Sanghyun, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Yang, Yanyun, Paek, Insu, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Primary studies increasingly use complex models. Slopes from multiple regression analyses are reported in primary studies, but few scholars have dealt with how to combine multiple regression slopes. One of the problems in combining multiple regression slopes is that each study may use a different regression model. The purpose of this research is to propose a method for combining partial regression slopes from studies with different regression models. The method combines comparable covariance matrices to obtain a synthetic partial slope. The proposed method assumes the population is homogeneous and that the different regression models are nested. Elements in the sample covariance matrix are not independent of each other, so missing elements should be imputed using conditional expectations. The Bartlett decomposition is used to decompose the sample covariance matrix into a parameter component and a sampling-error component. The proposed method treats the sample-size-weighted average as a parameter matrix and applies Bartlett's decomposition to the sample covariance matrices to get their respective error matrices. Since missing elements in the error matrix are not correlated, missing elements can be estimated in the error matrices and hence in the parameter matrices. Finally, the partial slopes can be computed from the combined matrices. Simulation shows the suggested method gives smaller standard errors than the listwise-deletion and pairwise-deletion methods. An empirical examination shows the suggested method can be applied to heterogeneous populations.
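A minimal sketch of the covariance-combining idea, on hypothetical summary data. Note that replacing the conditional-expectation imputation with a simple fall-back to the study that reports the element is a deliberate simplification of the method described above:

```python
import numpy as np

# Hypothetical summary data: study 1 reports cov of (y, x1, x2);
# study 2 uses a nested model and reports only cov of (y, x1).
S1 = np.array([[4.0, 1.2, 0.8],
               [1.2, 1.0, 0.3],
               [0.8, 0.3, 1.0]])
n1 = 120
S2 = np.array([[3.6, 1.0],
               [1.0, 0.9]])
n2 = 80

# Sample-size-weighted average of the elements both studies report;
# elements missing from study 2 fall back to study 1 (a simplification of
# the conditional-expectation imputation).
S = S1.copy()
S[:2, :2] = (n1 * S1[:2, :2] + n2 * S2) / (n1 + n2)

# Synthetic partial slopes for y on (x1, x2) from the combined matrix.
Sxx, Sxy = S[1:, 1:], S[1:, 0]
beta = np.linalg.solve(Sxx, Sxy)
print("combined partial slopes:", beta)
```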
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Jeon_fsu_0071E_14179
 Format
 Thesis
 Title
 Examining the Effect of Treatment on the Distribution of Blood Pressure in the Population Using Observational Data.
 Creator

Kucukemiroglu, Saryet Alexa, McGee, Daniel, Slate, Elizabeth H., Hurt, Myra M., Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Since the introduction of antihypertensive medications in the mid-1950s, there has been increasing use of blood pressure medications in the US. The growing use of antihypertensive treatment has affected the distribution of blood pressure in the population over time, so observational data no longer reflect natural blood pressure levels. Our goal is to examine the effect of antihypertensive drugs on distributions of blood pressure using several well-known observational studies. The statistical concept of censoring is used to estimate the distribution of blood pressure in populations if no treatment were available. The treated and estimated untreated distributions are then compared to determine the general effect of these medications in the population. Our analyses show that these drugs have an increasing impact on controlling blood pressure distributions in populations that are heavily treated.
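One way to make the censoring idea concrete (an illustrative reading, not necessarily the dissertation's exact formulation): if treatment only lowers blood pressure, a treated subject's reading is a right-censored observation of the untreated value, so a Kaplan-Meier estimator can recover the untreated distribution. A simulated sketch, with all numbers invented:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
bp = rng.normal(140, 15, n)                       # latent untreated systolic BP (simulated)
treated = rng.random(n) < 0.6 * (bp > 150)        # high readings more likely to be treated
observed = np.where(treated, bp - rng.uniform(10, 25, n), bp)

# A treated reading understates the untreated value, so treat it as
# right-censored at the observed level and apply Kaplan-Meier.
order = np.argsort(observed)
t, d = observed[order], (~treated)[order]         # untreated readings are "events"
at_risk = n - np.arange(n)                        # readings at or above t[i]
surv = np.cumprod(1.0 - d / at_risk)              # S(t) = P(untreated BP > t)

S160 = surv[np.searchsorted(t, 160, side="right") - 1]
print("estimated P(untreated SBP > 160):", round(S160, 3))
```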
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Kucukemiroglu_fsu_0071E_14275
 Format
 Thesis
 Title
 Semi-Parametric Generalized Estimating Equations with Kernel Smoother: A Longitudinal Study in Financial Data Analysis.
 Creator

Yang, Liu, Niu, Xufeng, Cheng, Yingmei, Huffer, Fred W. (Fred William), Tao, Minjing, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Longitudinal studies are widely used in various fields, such as public health, clinical trials, and financial data analysis. A major challenge for longitudinal studies is repeated measurements from each subject, which cause time-dependent correlation within subjects. Generalized Estimating Equations can deal with correlated outcomes for longitudinal data through marginal effects. My model is based on Generalized Estimating Equations with a semiparametric approach, providing a flexible structure for regression models: coefficients for parametric covariates are estimated, and nuisance covariates are fitted with kernel smoothers for the nonparametric part. The profile kernel estimator and the seemingly unrelated kernel estimator (SUR) are used to deliver consistent and efficient semiparametric estimators compared to parametric models. We provide simulation results for estimating semiparametric models with one or multiple nonparametric terms. In the application part, we focus on the financial market: a credit card loan data set with payment information for each customer across 6 months is used to investigate whether gender, income, age, or other factors significantly influence payment status. Furthermore, we propose model comparisons to evaluate whether our model should be fitted based on different levels of factors, such as male and female, or based on different types of estimating methods, such as parametric or semiparametric estimation.
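A profile-type estimator for a partially linear model can be sketched in a few lines: smooth both the response and the parametric covariates against the nuisance covariate, then regress residuals on residuals. This Speckman-style estimator is a simplified stand-in for the profile kernel estimator described above, on simulated independent data (no within-subject correlation structure):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
t = rng.uniform(0, 1, n)                 # nuisance covariate, modeled nonparametrically
X = rng.normal(size=(n, 2))              # parametric covariates
y = X @ np.array([1.0, -0.5]) + np.sin(2 * np.pi * t) + rng.normal(0, 0.3, n)

def kernel_smooth(t, v, h=0.08):
    """Nadaraya-Watson smoother of v against t with a Gaussian kernel."""
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    s = w.sum(axis=1)
    return (w @ v) / (s if v.ndim == 1 else s[:, None])

# Partial the smooth-in-t component out of y and X, then use ordinary
# least squares on the residuals to estimate the parametric coefficients.
beta_hat = np.linalg.lstsq(X - kernel_smooth(t, X),
                           y - kernel_smooth(t, y), rcond=None)[0]
print("beta_hat:", beta_hat)             # should land near (1.0, -0.5)
```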
 Date Issued
 2017
 Identifier
 FSU_FALL2017_YANG_fsu_0071E_14219
 Format
 Thesis
 Title
 Bayesian Modeling and Variable Selection for Complex Data.
 Creator

Li, Hanning, Pati, Debdeep, Huffer, Fred W. (Fred William), Kercheval, Alec N., Sinha, Debajyoti, Bradley, Jonathan R., Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

As we routinely encounter high-throughput datasets in complex biological and environmental research, developing novel models and methods for variable selection has received widespread attention. In this dissertation, we address a few key challenges in Bayesian modeling and variable selection for high-dimensional data with complex spatial structures. a) Most Bayesian variable selection methods are restricted to mixture priors having separate components for characterizing the signal and the noise. However, such priors encounter computational issues in high dimensions. This has motivated continuous shrinkage priors, resembling the two-component priors while facilitating computation and interpretability. While such priors are widely used for estimating high-dimensional sparse vectors, selecting a subset of variables remains a daunting task. b) Spatial/spatial-temporal data sets with complex structures are nowadays commonly encountered in scientific research fields ranging from atmospheric science and forestry to environmental, biological, and social science. Selecting important spatial variables that have significant influence on the occurrence of events is undoubtedly necessary and essential for providing insights to researchers. Self-excitation, the feature that occurrence of an event increases the likelihood of more occurrences of the same type of event nearby in time and space, can be found in many natural and social events. Research on modeling data with the self-excitation feature has drawn increasing interest recently. However, the existing literature on self-exciting models that include high-dimensional spatial covariates is still underdeveloped. c) The Gaussian process is among the most powerful modeling frameworks for spatial data. Its major bottleneck is the computational complexity stemming from inversion of the dense matrices associated with a Gaussian process covariance. Hierarchical divide-and-conquer Gaussian process models have been investigated for ultra-large data sets. However, the computation associated with scaling the distributed computing algorithm to handle a large number of subgroups poses a serious bottleneck. In Chapter 2 of this dissertation, we propose a general approach for variable selection with shrinkage priors. The presence of very few tuning parameters makes our method attractive in comparison to ad hoc thresholding approaches. The applicability of the approach is not limited to continuous shrinkage priors but extends to any shrinkage prior. Theoretical properties for near-collinear design matrices are investigated, and the method is shown to have good performance in a wide range of synthetic data examples and in a real data example on selecting genes affecting survival due to lymphoma. In Chapter 3, we propose a new self-exciting model that allows the inclusion of spatial covariates. We develop algorithms that are effective in obtaining accurate estimation and variable selection results in a variety of synthetic data examples. Our proposed model is applied to Chicago crime data, where the influence of various spatial features is investigated. In Chapter 4, we focus on a hierarchical Gaussian process regression model for ultra-high-dimensional spatial datasets. By evaluating the latent Gaussian process on a regular grid, we propose an efficient computational algorithm through circulant embedding. The latent Gaussian process borrows information across multiple subgroups, thereby obtaining more accurate predictions. The hierarchical model and our proposed algorithm are studied through simulation examples.
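The circulant-embedding device mentioned for Chapter 4 can be shown in one dimension: embed the grid covariance in a symmetric circulant matrix whose eigenvalues are the FFT of its first row, then draw an exact GP sample in O(n log n) without ever forming a dense matrix. A sketch with an assumed squared-exponential covariance (the kernel and grid size are illustrative):

```python
import numpy as np

n = 256
h = np.arange(n) / n
cov = np.exp(-(10 * h) ** 2)        # assumed covariances c(0), ..., c(n-1) on the grid

# Embed in a symmetric circulant "base" of length 2n-2; its eigenvalues
# are given by the FFT of the first row.
base = np.concatenate([cov, cov[-2:0:-1]])
lam = np.maximum(np.fft.fft(base).real, 0.0)   # clip tiny negative eigenvalues
m = lam.size

rng = np.random.default_rng(3)
z = rng.normal(size=m) + 1j * rng.normal(size=m)
sample = np.fft.fft(np.sqrt(lam / m) * z)[:n].real   # one exact GP draw on the grid
print(sample[:5])
```

The real part (and independently the imaginary part) of the transformed vector has exactly the target covariance, provided the embedding's eigenvalues are nonnegative; for fast-decaying kernels the clipped values are negligible.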
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Li_fsu_0071E_14159
 Format
 Thesis
 Title
 The Impact of Unbalanced Designs on the Performance of Parametric and Nonparametric DIF Procedures: A Comparison of Mantel-Haenszel, Logistic Regression, SIBTEST, and IRT-LR Procedures.
 Creator

Alghamdi, Abdullah Ahmed, Paek, Insu, Huffer, Fred W. (Fred William), Becker, Betsy Jane, Yang, Yanyun, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

The current study examined the impact of unbalanced sample sizes between focal and reference groups on the Type I error rates and DIF detection rates (power) of five DIF procedures (MH, LR, general IRT-LR, IRT-LRb, and SIBTEST). Five simulation factors were used in this study. Four factors were for generating simulation data: sample size, DIF magnitude, group mean ability difference (impact), and the studied item's difficulty. The fifth factor was the DIF method factor, which included MH, LR, general IRT-LR, IRT-LRb, and SIBTEST. A repeated-measures ANOVA, where the DIF method factor was the within-subjects variable, was performed to compare the performance of the five DIF procedures and to discover their interactions with other factors. For each data generation condition, 200 replications were made. Type I error rates for the MH and IRT-LR DIF procedures were close to or lower than the 5% nominal level across the different sample size levels. On average, the Type I error rates for IRT-LRb and SIBTEST were 5.7% and 6.4%, respectively. In contrast, the LR DIF procedure had a higher Type I error rate, which ranged from 5.3% to 8.1% with an average of 6.9%. As for the rejection rate under DIF conditions, or the DIF detection rate, IRT-LRb showed the highest rate, followed by SIBTEST, with averages of 71.8% and 68.4%, respectively. Overall, the impact of unbalanced sample sizes between reference and focal groups on DIF detection showed a similar tendency for all methods: DIF detection rates generally increased as the total sample size increased. In practice, IRT-LRb, which showed the best DIF detection rates while controlling the Type I error rate, should be the choice when the model-data fit is reasonable. If other non-IRT DIF methods are considered, MH or SIBTEST could be used, depending on which type of error (Type I or II) is considered more serious.
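For concreteness, the Mantel-Haenszel procedure pools an item's 2x2 tables across total-score strata into a common odds ratio. The sketch below uses made-up tables and reports the standard ETS delta rescaling, where larger |delta| indicates more DIF:

```python
import numpy as np

# Hypothetical 2x2 tables, one per total-score stratum:
# rows = (reference, focal), columns = (correct, incorrect)
tables = np.array([
    [[40, 10], [30, 20]],
    [[55, 15], [45, 25]],
    [[70, 10], [60, 20]],
])

A = tables[:, 0, 0]   # reference correct
B = tables[:, 0, 1]   # reference incorrect
C = tables[:, 1, 0]   # focal correct
D = tables[:, 1, 1]   # focal incorrect
N = tables.sum(axis=(1, 2))

# Mantel-Haenszel common odds ratio, pooled over strata
or_mh = (A * D / N).sum() / (B * C / N).sum()
delta_mh = -2.35 * np.log(or_mh)   # ETS delta scale
print("MH odds ratio:", or_mh, "delta:", delta_mh)
```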
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Alghamdi_fsu_0071E_14180
 Format
 Thesis
 Title
 The Impact of Rater Variability on Relationships among Different Effect-Size Indices for Inter-Rater Agreement between Human and Automated Essay Scoring.
 Creator

Yun, Jiyeo, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Paek, Insu, Zhang, Qian, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Since researchers began investigating automatic scoring systems for writing assessments, they have examined relationships between human and machine scoring and have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used to assess the relatedness of human and automated essay scoring, and to examine the impact of rater variability on inter-rater agreement. To implement the investigations, my study consists of two parts: an empirical study and a simulation study. Based on the results from the empirical study, the overall effects for inter-rater agreement were .63 and .99 for exact and adjacent proportions of agreement, .48 for kappas, and between .75 and .78 for correlations. Additionally, there were significant differences between 6-point scales and the other scales (i.e., 3-, 4-, and 5-point scales) for correlations, kappas, and proportions of agreement. Moreover, based on the results for the simulated data, the highest agreements and lowest discrepancies were achieved in the matched rater distribution pairs. Specifically, the means of the exact and adjacent proportions of agreement, kappa and weighted kappa values, and correlations were .58, .95, .42, .78, and .78, respectively. Meanwhile, the average standardized mean difference was .0005 in the matched rater distribution pairs. Acceptable values for inter-rater agreement as evaluation criteria for automated essay scoring, impacts of rater variability on inter-rater agreement, and relationships among inter-rater agreement indices are discussed.
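The agreement indices compared above are straightforward to compute directly. A sketch with hypothetical 4-point-scale ratings, implementing kappa and quadratic-weighted kappa from their standard definitions:

```python
import numpy as np

def kappas(h, m, k):
    """Cohen's kappa and quadratic-weighted kappa for two raters' scores in 1..k."""
    obs = np.zeros((k, k))
    for a, b in zip(h, m):
        obs[a - 1, b - 1] += 1
    obs /= obs.sum()
    exp = np.outer(obs.sum(axis=1), obs.sum(axis=0))   # chance agreement
    w = 1 - (np.subtract.outer(np.arange(k), np.arange(k)) ** 2) / (k - 1) ** 2
    kappa = 1 - (1 - np.trace(obs)) / (1 - np.trace(exp))
    wkappa = 1 - (1 - (w * obs).sum()) / (1 - (w * exp).sum())
    return kappa, wkappa

human   = [1, 2, 2, 3, 4, 4, 3, 2, 1, 4]   # hypothetical human scores
machine = [1, 2, 3, 3, 4, 3, 3, 2, 2, 4]   # hypothetical machine scores
print(kappas(human, machine, k=4))
```

With quadratic weights, adjacent disagreements are penalized only lightly, which is why weighted kappa typically exceeds unweighted kappa on ordinal essay scores.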
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Yun_fsu_0071E_14144
 Format
 Thesis
 Title
 Spatial Statistics and Its Applications in Biostatistics and Environmental Statistics.
 Creator

Hu, Guanyu, Huffer, Fred W. (Fred William), Paek, Insu, Sinha, Debajyoti, Slate, Elizabeth H., Bradley, Jonathan R., Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

This dissertation presents some topics in spatial statistics and their application in biostatistics and environmental statistics. The field of spatial statistics is an active area of statistics. In Chapter 2 and Chapter 3, the goal is to build subregion models under the assumption that the responses or the parameters are spatially correlated. For regression models, considering spatially varying coefficients is a reasonable way to build subregion models. There are two different techniques for exploring spatially varying coefficients. One is geographically weighted regression (Brunsdon et al. 1998). The other is a spatially varying coefficients model which assumes a stationary Gaussian process for the regression coefficients (Gelfand et al. 2003). Based on the ideas of these two techniques, we introduce techniques for exploring subregion models in survival analysis, which is an important area of biostatistics. In Chapter 2, we introduce modified versions of the Kaplan-Meier and Nelson-Aalen estimators which incorporate geographical weighting. We use ideas from counting process theory to obtain these modified estimators, to derive variance estimates, and to develop associated hypothesis tests. In Chapter 3, we introduce a Bayesian parametric accelerated failure time model with spatially varying coefficients. These two techniques can explore subregion models in survival analysis using both nonparametric and parametric approaches. In Chapter 4, we introduce Bayesian parametric covariance regression analysis for a response vector. The proposed method defines a regression model between the covariance matrix of a p-dimensional response vector and auxiliary variables. We propose a constrained Metropolis-Hastings algorithm to get the estimates. Simulation results are presented to show the performance of both the regression and covariance matrix estimates. Furthermore, we conduct a more realistic simulation experiment in which our Bayesian approach has better performance than the MLE. Finally, we illustrate the usefulness of our model by applying it to the Google Flu data. In Chapter 5, we give a brief summary of future work.
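The geographically weighted Kaplan-Meier idea of Chapter 2 can be sketched by kernel-weighting each subject's contribution to the risk set by distance from a target location. The data, the Gaussian kernel, and the bandwidth below are illustrative choices, not the dissertation's:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
loc = rng.uniform(0, 1, (n, 2))                 # subject locations
T = rng.exponential(1.0 + loc[:, 0])            # survival time depends on location
C = rng.exponential(2.0, n)                     # censoring times
time, event = np.minimum(T, C), T <= C

def gw_km(t0, s, bw=0.2):
    """Geographically weighted Kaplan-Meier survival estimate at location s."""
    w = np.exp(-0.5 * (np.linalg.norm(loc - s, axis=1) / bw) ** 2)
    order = np.argsort(time)
    tt, dd, ww = time[order], event[order], w[order]
    at_risk = ww[::-1].cumsum()[::-1]           # kernel-weighted risk set
    surv = np.cumprod(np.where(dd, 1.0 - ww / at_risk, 1.0))
    return np.interp(t0, tt, surv)

print("S(1.0) at the center:", gw_km(1.0, np.array([0.5, 0.5])))
```

Setting the bandwidth very large recovers the ordinary Kaplan-Meier estimator; shrinking it localizes the curve to the neighborhood of `s`.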
 Date Issued
 2017
 Identifier
 FSU_FALL2017_Hu_fsu_0071E_14205
 Format
 Thesis
 Title
 Median Regression for Complex Survey Data.
 Creator

Fraser, Raphael André, Sinha, Debajyoti, Lipsitz, Stuart, Carlson, Elwood, Slate, Elizabeth H., Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

The ready availability of public-use data from various large national complex surveys has immense potential for the assessment of population characteristics: means, proportions, totals, et cetera. Using a model-based approach, complex surveys can be used to evaluate the effectiveness of treatments and to identify risk factors for important diseases such as cancer. Existing statistical methods based on estimating equations and/or utilizing resampling methods are often not valid with survey data due to design features such as stratification, multistage sampling, and unequal selection probabilities. In this paper, we accommodate these design features in the analysis of highly skewed response variables arising from large complex surveys. Specifically, we propose a double-transform-both-sides based estimating equations approach to estimate the median regression parameters of the highly skewed response; the double-transform-both-sides method applies the same transformation twice to both the response and the regression function. The usual sandwich variance estimate can be used in our approach, whereas a resampling approach would be needed for a pseudo-likelihood based on minimizing absolute deviations. Furthermore, the double-transform-both-sides estimator is relatively robust to the true underlying distribution and has much smaller mean square error than the least absolute deviations estimator. The method is motivated by an analysis of laboratory data on urinary iodine concentration from the National Health and Nutrition Examination Survey.
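A rough sketch of the transform-both-sides idea on simulated skewed data: apply the same transformation twice to both the response and the regression function, then solve the resulting least-squares estimating equation. The power transformation, the data, and the unweighted (non-survey) setting here are all stand-in simplifications, not the paper's actual specification:

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(5)
n = 500
x = rng.uniform(0, 2, n)
# Skewed (lognormal) response whose median is exp(0.5 + 0.8 x)
y = np.exp(0.5 + 0.8 * x + rng.normal(0, 0.5, n))

def g(u, lam=0.25):
    """Illustrative power transformation; applying it twice gives u**(lam**2)."""
    return u ** lam

def resid(b):
    mu = np.exp(b[0] + b[1] * x)      # median regression function
    return g(g(y)) - g(g(mu))         # same double transform applied to both sides

fit = least_squares(resid, x0=np.zeros(2))
print("median-regression coefficients:", fit.x)
```

Transforming both sides symmetrizes the error distribution, so the least-squares solution tracks the conditional median rather than the mean, while keeping a smooth objective that admits a sandwich variance.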
 Date Issued
 2015
 Identifier
 FSU_2015fall_Fraser_fsu_0071E_12825
 Format
 Thesis
 Title
 Four Methods for Combining Dependent Effects from Studies Reporting Regression Analysis.
 Creator

Gunter, Tracey Danielle, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Almond, Russell G., Paek, Insu, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Over the years, a variety of indices have been proposed to summarize regression analyses. Unfortunately, the proposed indices are only appropriate when meta-analysts want to understand the role of a single predictor variable in predicting the outcome variable. However, sometimes meta-analysts want to understand the effect of a set of variables on an outcome variable. In this paper, four methods are presented for obtaining a composite effect for two focal predictor variables from a single regression model. The indices are the average of the standardized regression coefficients (ASC), the average of the standardized regression coefficients using Hedges and Olkin's (1985) approach (AHO), the sheaf coefficient (SC), and the squared multiple semi-partial correlation coefficient (MSP). A simulation study was conducted to examine the behavior of the indices and their variances when the number of predictor variables in the model, the sample size, the correlations between the focal predictor variables, and the correlations between the focal and non-focal predictor variables were manipulated. The results of the study show that the average bias values of the ASC and AHO estimates are small even when the sample size is small. Furthermore, the ASC and AHO estimates and their estimated variances are more precise than the other indices under all conditions examined. Therefore, when meta-analysts are interested in estimating the effect of a set of predictor variables on an outcome variable from a single regression model, the ASC or AHO procedures are preferred.
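The ASC index, for example, standardizes each focal slope and averages. A simulated sketch in which `x1` and `x2` play the role of the two focal predictors (the data-generating values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
R = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.2],
              [0.2, 0.2, 1.0]])
X = rng.multivariate_normal(np.zeros(3), R, n)
y = 0.5 * X[:, 0] + 0.4 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0, 1, n)

# Unstandardized slopes from the single regression model
b = np.linalg.lstsq(np.column_stack([np.ones(n), X]), y, rcond=None)[0][1:]

# ASC: standardize each slope, then average over the two focal predictors
b_std = b * X.std(axis=0, ddof=1) / y.std(ddof=1)
asc = b_std[:2].mean()
print("ASC composite effect for the two focal predictors:", asc)
```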
 Date Issued
 2015
 Identifier
 FSU_2015fall_Gunter_fsu_0071E_12829
 Format
 Thesis
 Title
 Sorvali Dilatation and Spin Divisors on Riemann and Klein Surfaces.
 Creator

Almalki, Yahya Ahmed, Nolder, Craig, Huffer, Fred W. (Fred William), Klassen, E. (Eric), van Hoeij, Mark, Florida State University, College of Arts and Sciences, Department of Mathematics
 Abstract/Description

We review the Sorvali dilatation of isomorphisms of covering groups of Riemann surfaces and extend the definition to groups containing glide-reflections. Then we give a bound for the distance between two surfaces, one of them resulting from twisting the other at a decomposing curve. Furthermore, we study spin structures on Riemann and Klein surfaces in terms of divisors. In particular, we take a closer look at spin structures on hyperelliptic and p-gonal surfaces defined by divisors supported on branch points. Moreover, we study invariant spin divisors under automorphisms and antiholomorphic involutions of Riemann surfaces.
 Date Issued
 2017
 Identifier
 FSU_SUMMER2017_ALMALKI_fsu_0071E_14064
 Format
 Thesis
 Title
 Improvement of Quality Prediction in Inter-Connected Manufacturing System by Integrating Multi-Source Data.
 Creator

Ren, Jie, Wang, Hui, Vanli, Omer Arda, Park, Chiwoo, Huffer, Fred W. (Fred William), Florida State University, FAMU-FSU College of Engineering, Department of Industrial and Manufacturing Engineering
 Abstract/Description

With the development of advanced sensing and network technology, such as wireless data transmission and data storage and analytics under cloud platforms, the manufacturing plant is going through a new revolution in which different production units/components can communicate with each other, leading to interconnected manufacturing. The interconnection enables the close coordination of process control actions among machines to improve product quality. Traditional quality prediction methods that focus on data from a single source are not sufficient to deal with the variation modeling and quality prediction problems involved in interconnected manufacturing. Instead, new quality prediction methods that can integrate data from multiple sources are necessary. This research addresses the fundamental challenges in improving quality prediction by data fusion for interconnected manufacturing, including knowledge sharing and transfer among different machines and collaborative error monitoring. The methodology is demonstrated through surface machining and additive manufacturing processes. The first study concerns surface quality prediction for one machining process by fusing multi-resolution spatial data measured from multiple surfaces or different surface machining processes. The surface variation is decomposed into a global trend part, which characterizes the spatially varying relationship between selected process variables and surface height, and a zero-mean spatial Gaussian process part. Three models, including two varying-coefficient-based spatial models and an inference rule-based spatial model, are proposed and compared. Also, a transfer learning technique is used to help train the model by transferring useful information from a data-rich surface to a data-lacking surface, which demonstrates the advantage of interconnected manufacturing. 
The second study deals with the surface mating errors caused by the surface variations from two interconnected surface machining processes. A model aggregating data from two surfaces is proposed to predict the leak areas for surface assembly. By using the measurements of leak areas and the profiles of the mated surfaces as training data, along with the Hagen–Poiseuille law, this study develops a novel diagnostic method to predict potential leak areas (leakage paths). The effectiveness and robustness of the proposed method are verified by an experiment and a simulation study. The approach provides practical guidance for the subsequent assembly process as well as for troubleshooting in manufacturing processes. The last study focuses on learning a quality prediction model in interconnected additive manufacturing systems, in which the different 3D printing processes involved are driven by similar printing mechanisms and can exchange quality data via a network. A quality prediction model that estimates the printing widths along the printing paths for material-extrusion-based additive manufacturing (a.k.a. fused filament fabrication or fused deposition modeling) is established by leveraging between-printer quality data. The established mathematical model quantifies the printing line width along the printing paths based on kinematic parameters, e.g., printing speed and acceleration, while considering data from multiple printers that share between-machine similarity. The method allows for between-printer knowledge sharing to improve quality prediction, so that a printing process with limited historical data can quickly learn an effective quality model without intensive retraining, thus improving the system's responsiveness to product variety. In the long run, the outcome of this research can contribute to the development of highly efficient Internet-of-Things manufacturing services for personalized products.
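The trend-plus-Gaussian-process decomposition described in the first study can be illustrated with a minimal kriging sketch. This is a generic sketch, not the models proposed in the thesis: the squared-exponential covariance, the lengthscale `ell`, the nugget `tau2`, and the toy 1-D profile are all illustrative assumptions.

```python
import numpy as np

def fit_trend_plus_gp(s, x, y, ell=1.0, tau2=1e-4):
    """Fit y(s) = x(s) @ beta + Z(s): OLS for the global trend, then a
    zero-mean GP with squared-exponential covariance for the residual."""
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ beta

    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / ell ** 2)

    K = k(s, s) + tau2 * np.eye(len(s))   # small nugget for numerical stability
    alpha = np.linalg.solve(K, resid)

    def predict(s_new, x_new):
        return x_new @ beta + k(s_new, s) @ alpha

    return predict

# toy 1-D surface profile: linear trend in one process variable plus a bump
s = np.linspace(0.0, 4.0, 9)[:, None]
x = np.hstack([np.ones((9, 1)), s])       # intercept + one process variable
y = 0.5 + 0.2 * s[:, 0] + np.sin(s[:, 0])
predict = fit_trend_plus_gp(s, x, y)
y_hat = predict(s, x)                     # nearly reproduces training heights
```

With a near-zero nugget, the GP residual term interpolates the detrended surface heights at the measured locations, which is the behavior the decomposition relies on.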
 Date Issued
 2019
 Identifier
 2019_Spring_Ren_fsu_0071E_15160
 Format
 Thesis
 Title
 Marked Determinantal Point Processes.
 Creator

Feng, Yiming, Nolder, Craig, Niu, Xufeng, Bradley, Jonathan R., Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Determinantal point processes (DPPs), which can be defined by their correlation kernels with known moments, are useful models for point patterns where nearby points exhibit repulsion. They have many nice properties, such as closed-form densities, tractable estimation of parameterized families, and no edge effects. In the past, univariate DPPs have been well studied in both discrete and continuous settings, although their statistical applications are fairly recent and still rather limited, whereas multivariate DPPs, the so-called multi-type marked DPPs, have been little explored. In this thesis, we propose a class of multivariate DPPs based on a block kernel construction. For the marked DPP, we show that the conditions for existence of the DPP can easily be satisfied. The block construction allows us to model the individually marked DPPs as well as to control the scale of repulsion between points having different marks. Unlike other researchers who model the kernel function of a DPP, we model its spectral representation, which not only guarantees the existence of the multivariate DPP but also makes simulation-based estimation methods readily available. In our research, we adopted a bivariate complex Fourier basis, which demonstrates nice properties such as constant intensity and approximate isotropy within a short distance between nearby points. The parameterized block kernels can approximate commonly used covariance functions via Fourier expansion. The parameters can be estimated using maximum likelihood estimation, a Bayesian approach, or minimum contrast estimation.
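For intuition, the repulsion that defines a DPP is visible in the discrete case, where the probability that a set of points A all appear in the sample is the determinant of the kernel restricted to A. This is a textbook DPP property, not the block kernel construction proposed in the thesis; the kernel values below are made up for illustration.

```python
import numpy as np

def inclusion_prob(K, A):
    """Discrete DPP with correlation kernel K (symmetric, eigenvalues in [0, 1]):
    P(A is contained in the sample) = det(K_A), the principal minor indexed by A."""
    idx = np.ix_(list(A), list(A))
    return float(np.linalg.det(K[idx]))

# two points with marginal inclusion probability 0.5 each and similarity 0.3
K = np.array([[0.5, 0.3],
              [0.3, 0.5]])
p_both = inclusion_prob(K, {0, 1})                         # 0.5*0.5 - 0.3**2 = 0.16
p_indep = inclusion_prob(K, {0}) * inclusion_prob(K, {1})  # 0.25
# p_both < p_indep: similar points repel each other
```

The off-diagonal entry measures similarity, and any nonzero similarity strictly lowers the joint inclusion probability below the independent case, which is the repulsion property.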
 Date Issued
 2019
 Identifier
 2019_Spring_Feng_fsu_0071E_15011
 Format
 Thesis
 Title
 Bayesian Tractography Using Geometric Shape Priors.
 Creator

Dong, Xiaoming, Srivastava, Anuj, Klassen, E. (Eric), Wu, Wei, Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Diffusion-weighted imaging (DWI) and tractography have been developed over decades and are key elements in recent large-scale efforts to map the human brain. Together, the two techniques provide a unique possibility to access the macroscopic structure and connectivity of the human brain non-invasively and in vivo. The information obtained not only can help visualize brain connectivity and segment the brain into different functional areas, but also provides tools for understanding some major cognitive diseases such as multiple sclerosis, schizophrenia, epilepsy, etc. Many efforts have been put into this area. On the one hand, a vast spectrum of tractography algorithms has been developed in recent years, ranging from deterministic approaches through probabilistic methods to global tractography; on the other hand, various mathematical models, such as the diffusion tensor, multi-tensor models, spherical deconvolution, and Q-ball modeling, have been developed to better exploit the acquisition-dependent DWI signal. Despite considerable progress in this area, current methods still face many challenges, such as sensitivity to noise, many false positive/negative fibers, inability to handle complex fiber geometry, and expensive computational cost. More importantly, recent research has shown that, even with high-quality data, the results of current tractography methods may not improve, suggesting that an anatomically accurate map of the human brain is unlikely to be obtained solely from the diffusion profile. Motivated by these issues, this dissertation develops a global approach that incorporates an anatomically validated geometric shape prior when reconstructing neuron fibers. The fiber tracts between regions of interest are initialized and updated via deformations based on gradients of the posterior energy defined in this work. 
This energy has contributions from the diffusion data, shape prior information, and a roughness penalty. The dissertation first describes and demonstrates the proposed method on a 2D dataset and then extends it to 3D phantom data and real brain data. The results show that the proposed method is relatively immune to issues such as noise, complicated fiber structures like fiber crossings and kissings, and false positive fibers, and achieves more explainable tractography results.
 Date Issued
 2019
 Identifier
 2019_Spring_DONG_fsu_0071E_15144
 Format
 Thesis
 Title
 Envelopes, Subspace Learning and Applications.
 Creator

Wang, Wenjing, Zhang, Xin, Tao, Minjing, Li, Wen, Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

The envelope model is a nascent dimension reduction technique. We focus on extending the envelope methodology to broader applications. In the first part of this thesis we propose a common reducing subspace model that can simultaneously estimate covariance matrices, precision matrices, and their differences across multiple populations. This model leads to substantial dimension reduction and efficient parameter estimation. We explicitly quantify the efficiency gain through an asymptotic analysis. In the second part, we propose a set of new mixture models called CLEMM (Clustering with Envelope Mixture Models) that is based on the widely used Gaussian mixture model assumptions. The proposed CLEMM framework and the associated envelope-EM algorithms provide the foundations for envelope methodology in unsupervised and semi-supervised learning problems. We also illustrate the performance of these models with simulation studies and empirical applications. In the third part of this thesis, we extend envelope discriminant analysis from vector data to tensor data. Another study, on copula-based models for forecasting realized volatility matrices, is included, which is an important financial application of estimating covariance matrices. We consider multivariate t and Clayton copulas, as well as bivariate t, Gumbel, and Clayton copulas, to model and forecast one-day-ahead realized volatility matrices. Empirical results show that copula-based models can achieve significant gains in terms of both statistical precision and economic efficiency.
 Date Issued
 2019
 Identifier
 2019_Spring_Wang_fsu_0071E_15085
 Format
 Thesis
 Title
 Impact of Violations of Measurement Invariance in Longitudinal Mediation Modeling.
 Creator

Xu, Jie, Yang, Yanyun, Zhang, Qian, Huffer, Fred W. (Fred William), Becker, Betsy J., Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

Research has shown that cross-sectional mediation analysis cannot accurately reflect a true longitudinal mediated effect. To investigate longitudinal mediated effects, different longitudinal mediation models have been proposed, and these models focus on different research questions related to longitudinal mediation. When fitting mediation models to longitudinal data, the assumption of longitudinal measurement invariance is usually made. However, the consequences of violating this assumption have not been thoroughly studied in mediation analysis. No studies have examined issues of measurement noninvariance in a latent cross-lagged panel mediation (LCPM) model with three or more measurement occasions. The goal of the current study is to investigate the impact of violations of measurement invariance on longitudinal mediation analysis. The focal model in the study is the LCPM model suggested by Cole and Maxwell (2003). This model can be used to examine mediated effects among the latent predictor, mediator, and outcome variables across time. In addition, it can account for measurement error and allows for the evaluation of longitudinal measurement invariance. Simulation methods were used, and the investigation was performed using population covariance matrices and sample data generated under various conditions. Eight design factors were considered for data generation: sample size, proportion of noninvariant items, position of latent factors with noninvariant items, type of noninvariant parameters, magnitude of noninvariance, pattern of noninvariance, size of the direct effect, and size of the mediated effect. Results from the population investigation were evaluated based on overall model fit and the calculated direct and mediated effects; results from the finite-sample analysis were evaluated in terms of convergence and inadmissible solutions, overall model fit, bias/relative bias, coverage rates, and statistical power/Type I error rates. 
In general, results obtained from the finite-sample analysis were consistent with those from the population investigation, with respect to both model fit and parameter estimation. The Type I error rate of the mediated effects was inflated under the noninvariant conditions with a small sample size (200); power for the direct and mediated effects was excellent (1.0 or close to 1.0) across all investigated conditions. Type I error rates based on the chi-square test statistic were seriously inflated under the invariant conditions, especially when the sample size was relatively small. Power for detecting model misspecifications due to longitudinal noninvariance was excellent across all investigated conditions. Fit indices (CFI, TLI, RMSEA, and SRMR) were not sensitive in detecting misspecifications caused by violations of measurement invariance in the investigated LCPM model. Study results also showed that as the magnitude of noninvariance, the proportion of noninvariant items, and the number of positions of latent variables with noninvariant items increased, estimation of the direct and mediated effects tended to be less accurate. A decreasing pattern of change in item parameters over measurement occasions resulted in the least accurate estimates of the direct and mediated effects. Parameter estimates were fairly accurate under the conditions with a decreasing-then-increasing pattern and with a mixed pattern of change in item parameters. Findings from this study can help empirical researchers better understand the potential impact of violating measurement invariance on longitudinal mediation analysis using the LCPM model.
 Date Issued
 2019
 Identifier
 2019_Spring_Xu_fsu_0071E_14994
 Format
 Thesis
 Title
 Random Walks over Point Processes and Their Application in Finance.
 Creator

Salehy, Seyyed Navid, Kercheval, Alec N., Ewald, Brian, Fahim, Arash, Ökten, Giray, Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Mathematics
 Abstract/Description

In continuous-time models in finance, it is common to assume that prices follow a geometric Brownian motion. More precisely, it is assumed that the price at time t ≥ 0 is given by Z_t = Z₀exp(σB_t + mt), where Z₀ is the initial price, B is standard Brownian motion, σ is the volatility, and m is the drift. We discuss how Z can be viewed as the limit of a sequence of discrete price models based on random walks. We note that in the usual random walks, jumps can only happen at deterministic times. We first construct a natural simple model for the price by considering a random walk in which jumps can happen at random times following a counting process N. We then develop a sequence of discrete price models using random walks over point processes. The limit process gives the new price model Z_t = Z₀exp(σB_{Λ_t} + mΛ_t), where Λ is the compensator of the counting process N. We note that if N is a Poisson process with intensity 1, then this model coincides with the geometric Brownian motion model for the price. But the new model provides more flexibility, as we can choose N to be any of many other well-known counting processes. This includes not only homogeneous and inhomogeneous Poisson processes, which have deterministic compensators, but also Hawkes processes, which have stochastic compensators. We also discuss and prove many properties of the process B_Λ. For example, we show that B_Λ is a continuous square-integrable martingale. Moreover, we discuss when B_Λ has uncorrelated increments and when it has independent increments. We also investigate how the Black–Scholes pricing formula changes if the price of the risky asset follows this new model when N is an inhomogeneous Poisson process. We show that the usual Black–Scholes formula is obtained when the counting process N is a Poisson process with intensity 1.
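The time-changed price model can be simulated directly in the simplest case, where N is a homogeneous Poisson process with intensity λ, so its compensator is the deterministic clock Λ_t = λt and B_{Λ_t} is a Brownian motion run on that clock. The function name and all parameter values below are illustrative, not from the dissertation.

```python
import numpy as np

def simulate_price(z0, sigma, m, lam, T, n_paths, n_steps, rng):
    """Simulate Z_t = z0 * exp(sigma * B_{Lambda_t} + m * Lambda_t), where
    N is a homogeneous Poisson process with intensity lam, so the
    compensator is the deterministic clock Lambda_t = lam * t."""
    dt = T / n_steps
    # Increments of B evaluated at the clock Lambda_t are N(0, lam * dt).
    dB = rng.normal(0.0, np.sqrt(lam * dt), size=(n_paths, n_steps))
    B_lam = np.cumsum(dB, axis=1)
    t = np.arange(1, n_steps + 1) * dt
    return z0 * np.exp(sigma * B_lam + m * lam * t)

rng = np.random.default_rng(0)
Z = simulate_price(z0=100.0, sigma=0.2, m=-0.02, lam=1.0, T=1.0,
                   n_paths=20000, n_steps=50, rng=rng)
# With lam = 1 the model reduces to ordinary geometric Brownian motion, and
# choosing m = -sigma**2 / 2 makes Z a martingale, so E[Z_T] = Z_0.
```

Replacing the deterministic clock by a stochastic compensator (e.g., from a Hawkes process) is where the extra flexibility of the model comes in; this sketch only covers the Poisson case, where the model and plain geometric Brownian motion coincide.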
 Date Issued
 2019
 Identifier
 2019_Spring_Salehy_fsu_0071E_15152
 Format
 Thesis
 Title
 Univariate and Multivariate Volatility Models for Portfolio Value at Risk.
 Creator

Xiao, Jingyi, Niu, Xufeng, Ökten, Giray, Wu, Wei, Huffer, Fred W. (Fred William), Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

In modern-day financial risk management, modeling and forecasting stock return movements via their conditional volatilities, and particularly predicting the Value at Risk (VaR), have become increasingly important for a healthy economic environment. In this dissertation, we evaluate and compare two main families of models for conditional volatilities, GARCH and Stochastic Volatility (SV), in terms of their VaR prediction performance for 5 major US stock indices. We calculate GARCH-type model parameters via Quasi Maximum Likelihood Estimation (QMLE), while for those of SV we employ MCMC with the Ancillary Sufficient Interweaving Strategy. We use the forecast volatilities corresponding to each model to predict the VaR of the 5 indices. We test the predictive performance of the estimated models by a two-stage backtesting procedure and then compare them via the Lopez loss function. Results of this dissertation indicate that even though it is more computationally demanding than GARCH-type models, SV dominates them in forecasting VaR. Since financial volatilities move together across assets and markets, it becomes apparent that modeling the volatilities in a multivariate framework is more appropriate. However, existing studies in the literature do not present compelling evidence for a strong preference between univariate and multivariate models. In this dissertation we also address the problem of forecasting portfolio VaR via multivariate GARCH models versus univariate GARCH models. We construct 3 portfolios with stock returns of 3 major US stock indices, 6 major banks, and 6 major technology companies, respectively. For each portfolio, we model the portfolio conditional covariances with GARCH, EGARCH, MGARCH-BEKK, MGARCH-DCC, and GO-GARCH models. For each estimated model, the forecast portfolio volatilities are further used to calculate (portfolio) VaR. 
The ability to capture the portfolio volatilities is evaluated by MAE and RMSE; the VaR prediction performance is tested through a two-stage backtesting procedure and compared in terms of the loss function. The results of our study indicate that even though MGARCH models are better at predicting the volatilities of some portfolios, GARCH models can perform as well as their multivariate (and computationally more demanding) counterparts.
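The step from a forecast volatility to a VaR number, and the first backtesting stage (counting violations), can be sketched in a few lines. Zero-mean, conditionally normal returns are an assumption of this sketch, not necessarily the distributional choice made in the dissertation.

```python
from statistics import NormalDist
import numpy as np

def one_day_var(sigma, alpha=0.05):
    """One-day-ahead VaR at level alpha for a zero-mean normal return with
    standard deviation sigma: the loss exceeded with probability alpha."""
    return -NormalDist().inv_cdf(alpha) * sigma

def violation_rate(returns, sigmas, alpha=0.05):
    """First backtesting stage: the fraction of days whose realized loss
    exceeds the VaR forecast; for a good model it should be close to alpha."""
    var = np.array([one_day_var(s, alpha) for s in sigmas])
    return float(np.mean(returns < -var))

rng = np.random.default_rng(1)
sigmas = np.full(10000, 0.01)          # a flat volatility forecast
returns = rng.normal(0.0, sigmas)      # returns consistent with the forecast
rate = violation_rate(returns, sigmas)  # close to the nominal 5%
```

A second backtesting stage would then test whether the violations cluster in time rather than arriving independently; that part is omitted here.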
 Date Issued
 2019
 Identifier
 2019_Spring_Xiao_fsu_0071E_15172
 Format
 Thesis
 Title
 Parameter Sensitive Feature Selection for Learning on Large Datasets.
 Creator

Gramajo, Gary, Barbu, Adrian G. (Adrian Gheorghe), Piyush, Kumar, Huffer, Fred W. (Fred William), She, Yiyuan, Zhang, Jinfeng, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Though there are many feature selection methods for learning, they might not scale well to very large datasets, such as those generated in computer vision. Furthermore, it can be beneficial to capture and model the variability inherent in data such as face detection, where a plethora of face poses (i.e., parameters) are possible. We propose a parameter-sensitive learning method that can learn effectively on datasets that would otherwise be prohibitively large. Our contributions are the following. First, we propose an efficient feature selection algorithm that optimizes a differentiable loss with sparsity constraints. We note that any differentiable loss can be used, and it will vary depending on the application. The iterative algorithm alternates parameter updates with tightening the sparsity constraints by gradually removing variables based on the coefficient magnitudes and a schedule. Second, we show how to train a single parameter-sensitive classifier that models the wide range of class variability. The single classifier is important since it reduces the amount of data necessary for training compared to methods where multiple classifiers are trained, one for each parameter value. Third, we show how to use nonlinear univariate response functions to obtain a nonlinear decision boundary with feature selection, an important characteristic since the separation of classes in real-world datasets is very challenging. Fourth, we show it is possible to mine hard negatives with feature selection, though it is more difficult. This is vital in computer vision, where 10^5 training examples can be generated per image. Fifth, we propose an approach to face detection using a 3D model on a number of face keypoints. We modify binary face features from the literature (generated using random forests) to fit into our 3D model framework. 
Experiments on detecting the face keypoints and on face detection using the proposed 3D models and modified face features show that feature selection dramatically improves performance and comes close to the state of the art on two standard datasets for face detection. We also apply our parameter-sensitive learning method with feature selection to detect malicious websites, a dataset with approximately 2.4 million websites and 3.3 million features per website. We outperform other batch algorithms and obtain results close to those of a high-performing online algorithm while using far fewer features.
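The first contribution, alternating gradient updates with gradual removal of small-magnitude coefficients, can be sketched for a logistic loss. The annealing schedule form and all parameter values below are illustrative assumptions, not the schedule used in the dissertation.

```python
import numpy as np

def select_features(X, y, k_final, iters=300, lr=0.5, mu=10.0):
    """Minimize logistic loss by gradient descent while shrinking the number
    of active features from p down to k_final on a schedule; at each step
    only the largest-magnitude coefficients survive."""
    n, p = X.shape
    w = np.zeros(p)
    active = np.arange(p)
    for t in range(iters):
        z = X[:, active] @ w[active]
        grad = X[:, active].T @ (1.0 / (1.0 + np.exp(-z)) - y) / n
        w[active] -= lr * grad
        # illustrative annealing schedule: all p features at t = 0,
        # down to k_final features by roughly t = iters / 2
        M = int(k_final + (p - k_final)
                * max(0.0, (iters - 2.0 * t) / (2.0 * t * mu + iters)))
        keep = np.argsort(-np.abs(w[active]))[:M]
        w[np.setdiff1d(active, active[keep])] = 0.0
        active = np.sort(active[keep])
    return active, w

# toy problem: only the first three of fifty features carry signal
rng = np.random.default_rng(2)
n, p = 500, 50
X = rng.normal(size=(n, p))
logits = 3.0 * X[:, 0] - 3.0 * X[:, 1] + 3.0 * X[:, 2]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)
active, w = select_features(X, y, k_final=3)  # should recover features {0, 1, 2}
```

Removing variables gradually, rather than all at once, lets weakly trained coefficients accumulate evidence before they are judged, which is the point of the schedule.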
 Date Issued
 2015
 Identifier
 FSU_migr_etd9604
 Format
 Thesis
 Title
 Tools for Statistical Analysis on Shape Spaces of Three-Dimensional Objects.
 Creator

Xie, Qian, Srivastava, Anuj, Klassen, E. (Eric), Huffer, Fred W. (Fred William), Wu, Wei, Zhang, Jinfeng, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

With the increasing popularity of information technology, especially electronic imaging techniques, large amounts of high-dimensional data such as 3D shapes have become pervasive in science, engineering, and even people's daily lives in recent years. Though the quantity of data is huge, the extraction of relevant knowledge from those data is still limited. How to understand data in a meaningful way is generally an open problem. The specific challenges include finding adequate mathematical representations of data and designing proper algorithms to process them. The existing tools for analyzing high-dimensional data, including 3D shape data, are found to be insufficient, as they usually suffer from many factors, such as misalignments, noise, and clutter. This thesis develops a framework for processing, analyzing, and understanding high-dimensional data, especially 3D shapes, by proposing a set of statistical tools including theory, algorithms, and optimization applied to practical problems. In particular, the following aspects of shape analysis are considered: 1. A framework adopting the SRNF representation, based on parallel transport of deformations across surfaces in the shape space, leads to statistical analysis of shape data. Three main analyses are conducted under this framework: (1) computing geodesics when either two end surfaces or the starting surface and an initial deformation are given; (2) parallel transporting deformations across surfaces; and (3) sampling random surfaces. 2. Computational efficiency plays an important role in performing statistical shape analysis on large datasets of 3D objects. To speed up the previous method, a framework with a numerical solution is introduced by approximating the inverse mapping, which reduces the computational cost by an order of magnitude. 3. The geometrical and morphological information of 3D objects, i.e., their shapes, can be analyzed explicitly using boundaries extracted from the original image scans. 
An alternative idea is to consider variability in shapes directly from their embedding images. A novel framework is proposed to unify three important tasks: registering, comparing, and modeling images. 4. Finally, the spatial deformations learned from registering images are modeled using a GRID-based decomposition. This specific model provides a way to decompose a large deformation into local, fundamental ones so that shape differences between images are easily interpretable. We conclude this thesis with the conclusions drawn in this research and, in the last chapter, discuss potential future directions of statistical shape analysis from both methodological and application perspectives.
 Date Issued
 2015
 Identifier
 FSU_migr_etd9495
 Format
 Thesis
 Title
 A Framework for Comparing Shape Distributions.
 Creator

Henning, Wade, Srivastava, Anuj, Alamo, Rufina G., Huffer, Fred W. (Fred William), Wu, Wei, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

The problem of comparing shape populations is present in many branches of science, including nanomanufacturing, medical imaging, particle analysis, fisheries, seed science, and computer vision. Researchers in these fields have traditionally characterized the profiles in these sets using combinations of scalar-valued descriptor features, like aspect ratio or roughness, whose distributions are easy to compare using classical statistics. However, there is a desire in this community for a single comprehensive feature that uniquely defines these profiles. The shape of the profile itself is such a feature. Shape features have traditionally been studied individually, and comparing distributions underlying sets of shapes is challenging. Since the data come in the form of samples from shape populations, we use kernel methods to estimate the underlying shape densities. We then take a metric approach to define a proper distance, termed the Fisher–Rao distance, to quantify differences between any two densities. This distance can be used for clustering, classification, and other types of statistical modeling; however, this dissertation focuses on comparing shape populations via a classical two-sample hypothesis test with populations characterized by their respective probability densities on shape space. Since we are interested in the shapes of planar closed curves, and the space of such curves is infinite dimensional, there are some theoretical issues in defining and estimating densities on this space. We therefore use a spherical multidimensional scaling algorithm to project shape distributions to the unit two-sphere, which allows us to use a von Mises–Fisher kernel for density estimation. The estimated densities are then compared using the Fisher–Rao distance, which, in turn, is estimated using Monte Carlo methods. This distance estimate is used as the test statistic for the two-sample hypothesis test mentioned above. 
We use a bootstrap approach to perform the test and to evaluate population classification performance. We demonstrate these ideas using applications from industrial and chemical engineering.
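The pipeline described in this abstract (von Mises-Fisher kernel density estimates on the unit two-sphere, compared by a Monte Carlo estimate of a Fisher-Rao-type distance) can be sketched as follows; the kernel concentration `kappa`, the function names, and the `arccos`-of-Bhattacharyya-coefficient distance convention are illustrative assumptions, not the dissertation's actual choices:

```python
import numpy as np

def vmf_kde(samples, kappa=20.0):
    """KDE on the unit 2-sphere using a von Mises-Fisher kernel of concentration kappa."""
    c = kappa / (4.0 * np.pi * np.sinh(kappa))      # vMF normalizing constant on S^2
    def density(x):
        return c * np.mean(np.exp(kappa * samples @ x))
    return density

def fisher_rao_distance(p, q, n_mc=20000, rng=None):
    """Monte Carlo estimate of arccos( integral of sqrt(p*q) over S^2 ), the
    great-circle distance between the square roots of two densities."""
    rng = np.random.default_rng(0) if rng is None else rng
    z = rng.normal(size=(n_mc, 3))
    z /= np.linalg.norm(z, axis=1, keepdims=True)   # uniform sample on S^2
    vals = np.array([np.sqrt(p(x) * q(x)) for x in z])
    bc = 4.0 * np.pi * vals.mean()                  # Bhattacharyya coefficient
    return float(np.arccos(np.clip(bc, -1.0, 1.0)))
```

Under the square-root representation of densities, the Fisher-Rao geodesic distance reduces to this arc length; a permutation or bootstrap test then compares the observed distance to distances recomputed after relabeling the pooled samples.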
Date Issued
 2014
 Identifier
 FSU_migr_etd9185
 Format
 Thesis
 Title
Within-Study Dependence in Meta-Analysis: Comparison of GLS Method and Multilevel Approaches.
 Creator

Lee, Seungjin, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Paek, Insu, Yang, Yanyun, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
Abstract/Description

Multivariate meta-analysis methods typically assume the dependence of effect sizes. One type of experimental-design study that generates dependent effect sizes is the multiple-endpoint study. While the generalized least squares (GLS) approach requires the sample covariance between outcomes within studies to deal with the dependence of the effect sizes, the univariate three-level approach does not require the sample covariance to analyze such multivariate effect-size data. Considering that primary studies rarely report the sample covariance, if the two approaches produce the same estimates and corresponding standard errors, the univariate three-level model approach could be an alternative to the GLS approach. The main purpose of this dissertation was to compare these two approaches under the random-effects model for synthesizing standardized mean differences in multiple-endpoint experimental designs using a simulation study. Two data sets were generated under the random-effects model: one set with two outcomes and the other with five outcomes. The simulation study found that the univariate three-level model yielded appropriate parameter estimates and standard errors corresponding to those from the multivariate meta-analysis using the GLS approach.
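The GLS pooling that this abstract contrasts with the three-level model can be sketched as follows, assuming each study reports an effect-size vector and a within-study covariance matrix; the function name and the fixed-effects, common-mean setup are illustrative, not the dissertation's code:

```python
import numpy as np

def gls_pooled_effects(effects, covs):
    """Fixed-effects GLS pooling of multiple-endpoint effect sizes.

    effects: list of length-p effect-size vectors, one per study
    covs:    list of p x p within-study covariance matrices
    Returns the pooled effect vector and its standard errors.
    """
    p = len(effects[0])
    a = np.zeros((p, p))
    b = np.zeros(p)
    for y, v in zip(effects, covs):
        w = np.linalg.inv(v)          # GLS weight = inverse within-study covariance
        a += w
        b += w @ y
    cov_pooled = np.linalg.inv(a)     # covariance of the pooled estimate
    beta = cov_pooled @ b
    return beta, np.sqrt(np.diag(cov_pooled))
```

The off-diagonal entries of each `covs[i]` are exactly the sample covariances that primary studies rarely report, which is what motivates the three-level alternative.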
Date Issued
 2014
 Identifier
 FSU_migr_etd9205
 Format
 Thesis
 Title
 Estimating Sensitivities of Exotic Options Using Monte Carlo Methods.
 Creator

Yuan, Wei, Ökten, Giray, Kim, Kyounghee, Huffer, Fred W. (Fred William), Kercheval, Alec N., Nichols, Warren, Florida State University, College of Arts and Sciences, Department of Mathematics
Abstract/Description

In this dissertation, methods of estimating the sensitivities of complex exotic options, including options written on multiple assets and options with discontinuous payoffs, are investigated. The calculation of the sensitivities (Greeks) is based on the finite difference method, the pathwise method, the likelihood ratio method and the kernel method, via Monte Carlo or quasi-Monte Carlo simulation. Direct Monte Carlo estimators for various sensitivities of weather derivatives and mountain range options are given. The numerical results show that the pathwise method outperforms the other methods when the payoff function is Lipschitz continuous. The kernel method and the central finite difference method are competitive when the payoff function is discontinuous.
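As an illustration of the comparison this abstract reports (not the dissertation's code), here is a sketch of the pathwise and central-finite-difference delta estimators for a plain European call under geometric Brownian motion, using common random numbers:

```python
import numpy as np

def sim_ST(s0, r, sigma, T, z):
    """Terminal GBM price for standard normal draws z."""
    return s0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)

def pathwise_delta(s0, k, r, sigma, T, z):
    """Pathwise estimator of a European call's delta (payoff is Lipschitz)."""
    st = sim_ST(s0, r, sigma, T, z)
    return np.mean(np.exp(-r * T) * (st > k) * st / s0)

def cfd_delta(s0, k, r, sigma, T, z, h=1e-2):
    """Central finite difference with common random numbers."""
    up = np.mean(np.exp(-r * T) * np.maximum(sim_ST(s0 + h, r, sigma, T, z) - k, 0.0))
    dn = np.mean(np.exp(-r * T) * np.maximum(sim_ST(s0 - h, r, sigma, T, z) - k, 0.0))
    return (up - dn) / (2 * h)
```

Because the call payoff is Lipschitz, the pathwise estimator is unbiased; for discontinuous payoffs the interchange of derivative and expectation fails, which is where the likelihood ratio and kernel methods mentioned above come in.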
Date Issued
 2015
 Identifier
 FSU_migr_etd9528
 Format
 Thesis
 Title
Meta-Analysis of Factor Analyses: Comparison of Univariate and Multivariate Approaches Using Correlation Matrices and Factor Loadings.
 Creator

Cho, Kyunghwa, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Paek, Insu, Yang, Yanyun, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
Abstract/Description

Increasingly, sophisticated techniques such as factor analyses are applied in primary research and thus may need to be meta-analyzed. This topic has been given little attention in the past due to its complexity. Because factor analysis is becoming more popular in research in many areas, including education, social work, and social science, the study of methods for the meta-analysis of factor analyses is also becoming more important. The first main purpose of this dissertation is to compare the results of seven different approaches to the meta-analysis of confirmatory factor analyses. Specifically, five approaches are based on univariate meta-analysis methods. The other two approaches use multivariate meta-analysis to obtain factor loadings and their standard errors. The results from each approach are compared. Given that factor analyses are commonly used in many areas, the second purpose of this dissertation is to identify the appropriate approach or approaches to use for the meta-analysis of factor analyses, especially confirmatory factor analysis (CFA). When the average sample size was small, the IRD, WMC, WMFL, and GLS-MFL approaches showed better parameter-estimation performance than the UMC, MFL, and GLS-MC approaches. With large average sample sizes (larger than 150), parameter-estimation performance seemed similar across all seven approaches. Based on my simulation results, researchers who want to conduct meta-analytic confirmatory factor analysis can apply any of these approaches to synthesize the results from primary studies if their studies have n > 150.
Date Issued
 2015
 Identifier
 FSU_migr_etd9570
 Format
 Thesis
 Title
 Exponential Convergence Fourier Method and Its Application to Option Pricing with Lévy Processes.
 Creator

Gu, Fangxi, Nolder, Craig, Huffer, Fred W. (Fred William), Kercheval, Alec N., Nichols, Warren D., Ökten, Giray, Florida State University, College of Arts and Sciences, Department of Mathematics
Abstract/Description

Option pricing by the Fourier method has been popular for the past decade, with many applications to Lévy processes, especially for European options. This thesis focuses on the exponentially convergent Fourier method and its application to discretely monitored options and Bermudan options. An alternative payoff-truncating method is derived for comparison with the benchmark Hilbert transform. A general error-control framework is derived to keep the Fourier method free of overflow problems. Numerical results verify that the alternative payoff-truncating sinc method performs better than the benchmark Hilbert transform method under the error-control framework.
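A toy illustration of why these Fourier methods can converge exponentially (an assumed example, not the thesis's sinc or Hilbert-transform machinery): for an analytic, rapidly decaying characteristic function, truncated trapezoidal inversion recovers the density with error that decays exponentially in the truncation and discretization:

```python
import numpy as np

def cf_density(phi, x, u_max=40.0, n=1024):
    """Recover a density from its characteristic function phi by truncating
    f(x) = (1/2pi) * integral of exp(-i*u*x) * phi(u) du to [-u_max, u_max]
    and applying the trapezoidal rule, which converges exponentially fast
    for analytic, rapidly decaying integrands."""
    u = np.linspace(-u_max, u_max, n)
    du = u[1] - u[0]
    w = np.full(n, du)
    w[0] = w[-1] = 0.5 * du                              # trapezoid end weights
    vals = np.exp(-1j * np.outer(np.atleast_1d(x), u)) * phi(u)
    return np.real(vals @ w) / (2.0 * np.pi)
```

For the standard normal characteristic function `exp(-u**2/2)` this reproduces the Gaussian density to near machine precision with only about a thousand nodes, which is the qualitative behavior Lévy-model pricing methods exploit.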
Date Issued
 2016
 Identifier
 FSU_FA2016_Gu_fsu_0071E_13579
 Format
 Thesis
 Title
Sparse Feature and Element Selection in High-Dimensional Vector Autoregressive Models.
 Creator

Huang, Xue, Niu, Xufeng, She, Yiyuan, Cheng, Yingmei, Huffer, Fred W. (Fred William), Wu, Wei, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

This thesis aims to identify the underlying structures of multivariate time series and to propose a methodology for constructing predictive VAR models. Due to the high dimensionality of multivariate time series, forecasting a target series with many predictors in VAR models poses a challenge in statistical learning and modeling. The quadratically increasing dimension of the parameter space, known as the "curse of dimensionality," poses considerable challenges to multivariate time series models. Meanwhile, two facts are involved in reducing dimensions in multivariate time series: first, some nuisance time series exist and are better removed; second, a target time series is typically driven by a few dependent elements constructed from some indices. To address these challenges, our approach is to reduce both the number of series and the features involved in each series simultaneously. As a result, the original high-dimensional structure can be modeled using a lower-dimensional time series, and subsequently the forecasting performance will be improved. The methodology introduced in this work is called Sparse Feature and Element Selection (SFES). It employs an "L1 + group L1" penalty to conduct group selection and variable selection within each group simultaneously. Our contributions in this thesis are twofold. First, the doubly-constrained regularization in SFES is a convex mathematical problem, and we optimize it using a fast but simple-to-implement algorithm. We evaluate this algorithm with a large-scale dataset and theoretically prove that it has guaranteed strict iterative convergence and global optimality. Second, we present non-asymptotic results based on combined statistical and computational analysis. A sharp oracle inequality is proved to reveal its power in predictive learning.
We compare SFES with the related work of Sparse Group Lasso (SGL) to show that the proposed method is both computationally efficient and theoretically justified. Experiments using simulation data and realworld macroeconomic time series data are conducted to demonstrate the efficiency and efficacy of the proposed SFES in practice.
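The "L1 + group L1" penalty has a closed-form proximal operator, which is what makes fast first-order algorithms practical for this kind of doubly-constrained problem. A sketch follows; this is the standard sparse-group operator (elementwise soft-thresholding followed by groupwise shrinkage) rather than the SFES algorithm itself, and the function name and thresholds are illustrative:

```python
import numpy as np

def prox_sparse_group(v, lam1, lam2, groups):
    """Proximal operator of lam1*||x||_1 + lam2*sum_g ||x_g||_2.

    The combined prox decomposes: first apply the elementwise L1
    soft-threshold, then the group soft-threshold on each block.
    """
    x = np.sign(v) * np.maximum(np.abs(v) - lam1, 0.0)   # L1 soft-threshold
    out = np.zeros_like(x)
    for g in groups:                                     # group soft-threshold
        norm = np.linalg.norm(x[g])
        if norm > lam2:
            out[g] = (1.0 - lam2 / norm) * x[g]
    return out
```

Groups whose post-threshold norm falls below `lam2` are zeroed entirely (series selection), while surviving groups keep elementwise sparsity (feature selection within a series).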
Date Issued
 2016
 Identifier
 FSU_FA2016_Huang_fsu_0071E_13659
 Format
 Thesis
 Title
 Evaluation of Measurement Invariance in IRT Using Limited Information Fit Statistics/Indices: A Monte Carlo Study.
 Creator

Cui, Mengyao, Yang, Yanyun, Paek, Insu, Huffer, Fred W. (Fred William), Becker, Betsy Jane, Binici, Salih, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
Abstract/Description

Measurement invariance analysis is important when test scores are used to make groupwise comparisons. Multiple-group IRT modeling is one of the commonly used methods for examining measurement invariance. One essential step in the multiple-group modeling method is the evaluation of overall model-data fit. A family of limited-information fit statistics has recently been developed for assessing overall model-data fit in IRT. Previous studies evaluated the performance of limited-information fit statistics using single-group data, and found that these fit statistics performed better than the traditional full-information fit statistics when data were sparse. However, no study has investigated the performance of the limited-information fit statistics within the multiple-group modeling framework. This study aims to examine the performance of the limited-information fit statistic M₂ and M₂-based descriptive fit indices in conducting measurement invariance analysis within the multiple-group IRT framework. A Monte Carlo study was conducted to examine the sampling distributions of M₂ and M₂-based descriptive fit indices, and their sensitivity to lack of measurement invariance under various conditions. The manipulated factors included sample sizes, model types, dimensionality, types and numbers of DIF items, and latent trait distributions. Results showed that M₂ followed an approximately chi-square distribution when the model was correctly specified, as expected. The Type I error rates of M₂ were reasonable under large sample sizes (1000/2000). When the model was misspecified, the power of M₂ was a function of sample size and the number of DIF items. For example, the power of M₂ for rejecting the U2PL Scalar Model increased from 29.2% to 99.9% when the number of uniform DIF items increased from one to six, given sample sizes of 1000/2000.
With six uniform DIF items (30% of the studied items), the power of M₂ increased from 42.4% to 99.9% when sample sizes changed from 250/500 to 1000/2000. When the difference in M₂ (ΔM₂) was used to compare two correctly specified nested models, the sampling distribution of ΔM₂ departed from the reference chi-square distribution at both tails, especially under small sample sizes. The Type I error rates of the ΔM₂ test became closer to expectation as sample sizes increased. For example, both the Metric and Configural Models were correctly specified when the test included no DIF item. Given an alpha level of .05, the Type I error rates of ΔM₂ for the comparison between the Metric and Configural Models were slightly inflated with n = 250/500 (8.72%), and became closer to the alpha level with n = 1000/2000 (5.3%). When at least one of the models was misspecified, the power of ΔM₂ increased as the number of DIF items or the sample sizes became larger. For example, the Metric Model was misspecified when a nonuniform DIF item existed. Given sample sizes of 1000/2000 and an alpha level of .05, the power of ΔM₂ for the comparison between the Metric and Configural Models increased from 52.55% to 99.39% when the number of nonuniform DIF items changed from one to six. With one nonuniform DIF item in the test, the power of ΔM₂ was only 17.05% given an alpha level of .05 and sample sizes of 250/500, but increased to 52.55% given sample sizes of 1000/2000. The descriptive fit indices and their differences between nested models were also affected by the number of DIF items. When there was no DIF item, all fit indices indicated good model-data fit. The differences in the five fit indices between nested models were all very small (<.008) across different sample sizes. When DIF items existed, the means of the descriptive fit indices, and their differences between nested models, increased as the number of DIF items increased.
The findings from this study provide some suggestions about implementing the limited-information fit statistics/indices in measurement invariance analysis within the multiple-group IRT framework.
Date Issued
 2016
 Identifier
 FSU_FA2016_Cui_fsu_0071E_13537
 Format
 Thesis
 Title
 Investigating the ChiSquareBased ModelFit Indexes for WLSMV and ULSMV Estimators.
 Creator

Xia, Yan, Yang, Yanyun, Huffer, Fred W. (Fred William), Almond, Russell G., Becker, Betsy Jane, Paek, Insu, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
Abstract/Description

In structural equation modeling (SEM), researchers use the model chi-square statistic and model-fit indexes to evaluate model-data fit. The root mean square error of approximation (RMSEA), comparative fit index (CFI), and Tucker-Lewis index (TLI) are widely applied model-fit indexes. When data are ordered and categorical, the most popular estimator is the diagonally weighted least squares (DWLS) estimator. Robust corrections have been proposed to adjust the uncorrected chi-square statistic from DWLS so that its first- and second-order moments align with the target central chi-square distribution under correctly specified models. DWLS with such a correction is called the mean- and variance-adjusted weighted least squares (WLSMV) estimator. An alternative to WLSMV is the mean- and variance-adjusted unweighted least squares (ULSMV) estimator, which has been shown to perform as well as, or slightly better than, WLSMV. Because the chi-square statistic is corrected, the chi-square-based RMSEA, CFI, and TLI are also corrected by replacing the uncorrected chi-square statistic with the robust chi-square statistic. The robust model-fit indexes calculated in this way are named the population-corrected robust (PR) model-fit indexes, following Brosseau-Liard, Savalei, and Li (2012). The PR model-fit indexes are currently reported in almost every application where WLSMV or ULSMV is used. Nevertheless, previous studies have found that the PR model-fit indexes from WLSMV are sensitive to several factors such as sample sizes, model sizes, and thresholds for categorization. The first focus of this dissertation is the dependency of model-fit indexes on the thresholds for ordered categorical data. Because the weight matrix in the WLSMV fit function and the correction factors for both WLSMV and ULSMV include the asymptotic variances of thresholds and polychoric correlations, the model-fit indexes are very likely to depend on the thresholds.
The dependency of model-fit indexes on the thresholds is not a desirable property, because when the misspecification lies in the factor structure (e.g., cross-loadings are ignored or two factors are treated as a single factor), model-fit indexes should reflect that misspecification rather than the threshold values. As alternatives to the PR model-fit indexes, Brosseau-Liard et al. (2012), Brosseau-Liard and Savalei (2014), and Li and Bentler (2006) proposed the sample-corrected robust (SR) model-fit indexes. The PR fit indexes are found to converge to distorted asymptotic values, but the SR fit indexes converge to their definitions asymptotically. However, the SR model-fit indexes were proposed for continuous data, and have been neither investigated nor implemented in SEM software when WLSMV and ULSMV are applied. This dissertation thus investigates the PR and SR model-fit indexes for WLSMV and ULSMV. The first part of the simulation study examines the dependency of the model-fit indexes on the thresholds when the model misspecification results from omitting cross-loadings or collapsing factors in confirmatory factor analysis. The study is conducted on extremely large computer-generated datasets in order to approximate the asymptotic values of the model-fit indexes. The results show that only the SR fit indexes from ULSMV are independent of the population threshold values, given the other design factors. The PR fit indexes from ULSMV, and the PR and SR fit indexes from WLSMV, are influenced by thresholds, especially when data are binary and the hypothesized model is greatly misspecified. The second part of the simulation varies the sample sizes from 100 to 1000 to investigate whether the SR fit indexes under finite samples are more accurate estimates of the defined values of RMSEA, CFI, and TLI, compared with the uncorrected model-fit indexes without robust correction and the PR fit indexes. Results show that the SR fit indexes are more accurate in general.
However, when the thresholds differ across items, data are binary, and the sample size is less than 500, all versions of these indexes can be very inaccurate. In such situations, larger sample sizes are needed. In addition, the conventional cutoffs developed for continuous data with maximum likelihood (e.g., RMSEA < .06, CFI > .95, and TLI > .95; Hu & Bentler, 1999) have been applied to WLSMV and ULSMV despite arguments against such a practice (e.g., Marsh, Hau, & Wen, 2004). For comparison purposes, this dissertation reports the RMSEA, CFI, and TLI based on continuous data using maximum likelihood before the variables are categorized to create ordered categorical data. Results show that the model-fit indexes from maximum likelihood are very different from those from WLSMV and ULSMV, suggesting that the conventional rules should not be applied to WLSMV and ULSMV.
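For reference, the three chi-square-based indexes discussed in this abstract are simple functions of the model and baseline chi-squares; a sketch using the standard formulas (the robust PR/SR variants replace `chi2_m` and the asymptotic quantities with corrected versions, which is the dissertation's subject and is not shown here):

```python
import math

def fit_indices(chi2_m, df_m, chi2_b, df_b, n):
    """RMSEA, CFI, TLI from model (m) and baseline (b) chi-squares, sample size n."""
    rmsea = math.sqrt(max(chi2_m - df_m, 0.0) / (df_m * (n - 1)))
    num = max(chi2_m - df_m, 0.0)                 # model noncentrality
    den = max(chi2_b - df_b, chi2_m - df_m, 0.0)  # baseline noncentrality
    cfi = 1.0 - (num / den if den > 0 else 0.0)
    tli = ((chi2_b / df_b) - (chi2_m / df_m)) / ((chi2_b / df_b) - 1.0)
    return rmsea, cfi, tli
```

A perfectly fitting model (chi-square equal to its degrees of freedom) yields RMSEA = 0 and CFI = TLI = 1, which is why distortions introduced by the robust correction propagate directly into all three indexes.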
Date Issued
 2016
 Identifier
 FSU_2016SU_Xia_fsu_0071E_13379
 Format
 Thesis
 Title
Log-linear Model as a DIF Detection Method for Dichotomous and Polytomous Items and Its Comparison with Other Observed-Score Matching DIF Methods.
 Creator

Yesiltas, Gonca, Paek, Insu, Huffer, Fred W. (Fred William), Becker, Betsy Jane, Almond, Russell G., Florida State University, College of Education, Department of Educational Psychology and Learning Systems
Abstract/Description

DIF detection methods identify differences between the performances of subgroups when the subgroups are matched on examinees' ability level or a proxy variable, such as total test score (Holland & Wainer, 1993). The log-linear model (LLM) method is one such DIF detection method. It was first introduced by Mellenbergh (1982) to investigate the relationships among item responses, subgroups, and categorized total test score for DIF detection. This study examined the performance of LLM as a DIF detection method for dichotomous and polytomous items. The LLM method was compared with the Mantel-Haenszel (MH) and logistic regression (LR) methods for detecting uniform DIF, and with LR for detecting nonuniform DIF, in dichotomous item response data. MH was not included in nonuniform DIF detection because previous studies indicated that it is unable to detect nonuniform DIF (Narayanan & Swaminathan, 1996; Uttaro & Millsap, 1994). In addition, LLM was compared with the Mantel, generalized Mantel-Haenszel (GMH), ordinal logistic regression (OLR), and logistic discriminant function analysis (LDFA) methods for polytomous item response data. For this purpose, both a simulation study and an empirical study were conducted under various sample sizes, ability mean differences (impact), and item parameters. Since previous studies did not investigate the effect of ability mean differences on DIF detection with LLM, this study also focused on the effect of ability mean differences between subgroups. This study found that MH was best at detecting uniform DIF, while LR and LLM performed equally well on uniform and nonuniform DIF detection. In addition, GMH and LLM performed better than Mantel, OLR, and LDFA for polytomous item response data.
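The logistic regression (LR) method compared in this abstract can be sketched for dichotomous items as follows: uniform DIF appears as a group main effect and nonuniform DIF as a score-by-group interaction. This is a minimal Newton-Raphson implementation with Wald z-statistics; the function names and coding are illustrative, not the dissertation's setup:

```python
import numpy as np

def logistic_fit(X, y, n_iter=50):
    """Logistic regression via Newton-Raphson; returns coefficients and SEs."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        h = X.T @ (X * w[:, None])                  # observed information
        beta += np.linalg.solve(h, X.T @ (y - p))   # Newton step
    cov = np.linalg.inv(h)
    return beta, np.sqrt(np.diag(cov))

def lr_dif_test(score, group, item):
    """Wald z-tests: group main effect flags uniform DIF,
    score-by-group interaction flags nonuniform DIF."""
    X = np.column_stack([np.ones_like(score), score, group, score * group])
    beta, se = logistic_fit(X, item)
    return beta[2] / se[2], beta[3] / se[3]         # z_uniform, z_nonuniform
```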
Date Issued
 2016
 Identifier
 FSU_2016SP_Yesiltas_fsu_0071E_13119
 Format
 Thesis
 Title
The Impact of Competition on Elephant Musth Strategies: A Game-Theoretic Model.
 Creator

Wyse, J. Maxwell (John Maxwell), Mesterton-Gibbons, Mike, Huffer, Fred W. (Fred William), Hurdal, Monica K., Cogan, Nicholas G., Florida State University, College of Arts and Sciences, Department of Mathematics
Abstract/Description

Mature male African elephants are known to periodically enter a temporary state of heightened aggression called "musth," often linked with increased androgens, particularly testosterone. Sexually mature males are capable of entering musth at any time of year, and will often travel long distances to find estrous females. When two musth bulls or two nonmusth bulls encounter one another, the agonistic interaction is usually won by the larger male. When a smaller musth bull encounters a larger nonmusth bull, however, the smaller musth male can win. The relative mating success of musth males is due partly to this fighting advantage, and partly to estrous females' general preference for musth males. Though musth behavior has long been observed and documented, the evolutionary advantages of musth remain poorly understood. Here we develop a game-theoretic model of male musth behavior which takes musth duration as a parameter and predicts the distributions of small, medium and large musth males in both time and space. The predicted results are similar to the observed timing strategies in the Amboseli National Park elephant population. We discuss small-male musth behavior, musth-estrus coincidence, the effects of estrous-female spatial heterogeneity on musth timing, conservation applications, the assumptions underpinning the model, and possible modifications to the model for the purpose of determining musth duration.
Date Issued
 2017
 Identifier
 FSU_2017SP_Wyse_fsu_0071E_13713
 Format
 Thesis
 Title
 Random Sobol' Sensitivity Analysis and Model Robustness.
 Creator

Mandel, David, Ökten, Giray, Hussaini, M. Yousuff, Huffer, Fred W. (Fred William), Kercheval, Alec N., Fahim, Arash, Florida State University, College of Arts and Sciences, Department of Mathematics
Abstract/Description

This work develops both the theoretical foundation and the practical application of random Sobol' analysis, with two goals. The first is to provide a more general and accommodating approach to global sensitivity analysis, in which the parameter distributions themselves contain uncertainty, and hence the sensitivity results are random quantities as well. The framework for this approach is motivated by empirical evidence of such behavior, and examples from interest rate and temperature modeling are provided. The second goal is to compare competing models on their robustness, a notion developed and defined to provide a quantitative solution to model selection based on model uncertainty and sensitivity.
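The deterministic building block of the random Sobol' analysis described here is the ordinary first-order Sobol' index; a pick-freeze Monte Carlo sketch for independent U(0,1) inputs (the estimator variant, sample size, and function names are assumptions, not the dissertation's implementation):

```python
import numpy as np

def sobol_first_order(f, d, n=100_000, rng=None):
    """Pick-freeze Monte Carlo estimate of the first-order Sobol' indices
    S_i = Var(E[f|x_i]) / Var(f) for f with d independent U(0,1) inputs."""
    rng = np.random.default_rng(0) if rng is None else rng
    a, b = rng.random((n, d)), rng.random((n, d))
    fa, fb = f(a), f(b)
    var = np.concatenate([fa, fb]).var()
    s = np.empty(d)
    for i in range(d):
        ab = a.copy()
        ab[:, i] = b[:, i]                   # A with column i taken from B
        s[i] = np.mean(fb * (f(ab) - fa)) / var
    return s
```

In the random version, the input distributions themselves would be drawn from a hyper-distribution and this computation repeated, yielding a distribution over each `s[i]`.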
Date Issued
 2017
 Identifier
 FSU_2017SP_Mandel_fsu_0071E_13682
 Format
 Thesis
 Title
Time-Varying Mixture Models for Financial Risk Management.
 Creator

Zhang, Shuguang, Niu, Xufeng, Cheng, Yingmei, Huffer, Fred W. (Fred William), Tao, Minjing, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Motivated by the devastating financial crisis of 2008, which was partially caused by underestimation of financial risk, we propose a class of time-varying mixture models for risk analysis and management. There are various metrics for financial risk, including value at risk (VaR), expected shortfall, and expected/unexpected loss. In this study we focus on VaR. One commonly used method to estimate VaR is the variance-covariance method, in which a normal distribution is usually assumed for asset returns, which may underestimate the real risk. To address this issue, we propose a series of two-component mixture models: one component is a normal distribution and the other is a fat-tailed distribution such as a Cauchy distribution, Student's t-distribution, or Gumbel distribution. Instead of assuming the distribution parameters and weights to be constant, we allow them to change over time, which guarantees the flexibility of our models. The Monte Carlo Expectation-Maximization method and Monte Carlo maximum likelihood estimation were used for parameter estimation. Simulation studies are conducted and the models are applied to stock market price data.
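A minimal sketch of the effect motivating these models: VaR under a static normal/Student-t mixture, estimated by Monte Carlo, exceeds the pure-Gaussian VaR at the same scale. The time-varying weights and MCEM fitting from the dissertation are not shown; the weight, scale, and degrees of freedom below are illustrative:

```python
import numpy as np

def mixture_var(alpha, w, mu, sigma, df, n=200_000, rng=None):
    """Monte Carlo VaR at level alpha for returns drawn from the mixture
    w * N(mu, sigma^2) + (1 - w) * (sigma * t_df + mu)."""
    rng = np.random.default_rng(0) if rng is None else rng
    comp = rng.random(n) < w                       # component indicator
    r = np.where(comp,
                 rng.normal(mu, sigma, n),         # thin-tailed component
                 rng.standard_t(df, n) * sigma + mu)  # fat-tailed component
    return -np.quantile(r, alpha)                  # VaR is the negated quantile
```

Even a 10% t(3) component pushes the 1% VaR well above the Gaussian value 2.33·sigma, which is the underestimation the abstract refers to.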
 Date Issued
 2016
 Identifier
 FSU_2016SP_Zhang_fsu_0071E_13150
 Format
 Thesis
 Title
 An Evaluation of Four Methods for Determining the Number of Factors Underlying Measurement Indicators under the Presence of Guessing Effects.
 Creator

Cukadar, Ismail, Yang, Yanyun, Huffer, Fred W. (Fred William), Becker, Betsy Jane, Binici, Salih, Paek, Insu, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
 Abstract/Description

In factor analysis, determining the number of factors underlying measurement indicators is important. An incorrect decision on the number of factors may mislead practitioners in estimating parameters in factor analysis, reporting students' scores, calibrating items through an item response theory model, equating or linking different test forms, estimating reliability, examining differential item functioning, and investigating validity. Exploratory factor analysis, parallel analysis, Kaiser's rule, and Cattell's scree test are commonly used methods for deciding on the number of factors in educational and psychological assessments. When a test consists of multiple-choice or ordinal-scaled items, some test takers might find the correct answers through guessing. The guessing effect might impact the correlation coefficients among items, and thus might also impact the decisions on the number of factors reached via exploratory factor analysis, parallel analysis, Kaiser's rule, and Cattell's scree test. None of these four methods accounts for guessing effects by modeling a guessing parameter when examining the dimensionality of data. The main purpose of this study is to investigate the impact of guessing on the performance of exploratory factor analysis, parallel analysis, Kaiser's rule, and Cattell's scree test in determining the number of factors underlying measurement indicators. Among these four methods, Cattell's scree test is a subjective method because determining the elbow point in the scree plot requires the user to make a judgment call. Therefore, another purpose of this study is to propose a method that may allow for a more objective evaluation of Cattell's scree test, specifically through calculating angles in the scree plot.
A Monte Carlo study was conducted to examine the performance of exploratory factor analysis, Kaiser's rule, parallel analysis, and the revised scree test in determining the dimensionality of data when guessing effects were present. The following design factors were manipulated: factor structure, sample size, test length, the number of factors, values of the pseudo-guessing parameters, and the correlation between factors. The results showed that all four methods performed worse at determining the number of factors in the presence of guessing effects than in their absence; in other words, none of the four methods was robust to guessing effects. Among the four methods, parallel analysis performed the best. The results also showed that all four methods tended to retain fewer factors as the guessing effects became greater. Across all levels of guessing effects, parallel analysis was the best method for identifying the number of factors under conditions with simple structures, while exploratory factor analysis using the chi-square difference test was the best method for determining the dimensionality of bifactor models. In terms of the methods for estimating polychoric correlations, the maximum likelihood and Bayesian methods performed almost identically and led to similar estimated numbers of factors across the four methods. The current study design indicated that two different cutoff values were reasonable for determining the number of factors via the revised Cattell's scree test: 161 for simple-structure models and 173 for bifactor models. With these cutoff values, the revised Cattell's scree test performed better at determining the number of factors under conditions with simple structures than with bifactor models.
Although practitioners and researchers may consider using the revised Cattell's scree test to evaluate a scree plot in a more objective way, it is important to use the indicated cutoff values with caution, as they may not be applicable under other study conditions.
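The parallel-analysis criterion the abstract evaluates can be sketched in a few lines. The factor structure, loadings, and sample sizes below are illustrative assumptions, not the study's simulation design: observed eigenvalues of the correlation matrix are retained only while they exceed the average eigenvalues from same-sized random data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: 500 respondents, 9 indicators, 3 orthogonal factors
# with loadings of 0.7 and unit-variance noise (assumed, for the sketch).
n, p, k = 500, 9, 3
loadings = np.zeros((p, k))
for j in range(k):
    loadings[j * 3:(j + 1) * 3, j] = 0.7
data = rng.standard_normal((n, k)) @ loadings.T + rng.standard_normal((n, p))

# Observed eigenvalues of the correlation matrix, largest first.
obs_eig = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

# Parallel analysis (Horn): average the ordered eigenvalues from random
# normal data of the same shape over many replications.
reps = 200
rand_eig = np.zeros((reps, p))
for r in range(reps):
    noise = rng.standard_normal((n, p))
    rand_eig[r] = np.sort(
        np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
threshold = rand_eig.mean(axis=0)

# Retain each factor whose observed eigenvalue beats the random benchmark.
n_factors = int(np.sum(obs_eig > threshold))
print(n_factors)
```

With clean data the criterion recovers the simulated three factors; the study's point is that adding a guessing process attenuates the item correlations, pulling the leading eigenvalues down toward the random benchmark so fewer factors survive the comparison.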
 Date Issued
 2019
 Identifier
 2019_Summer_Cukadar_fsu_0071E_15052
 Format
 Thesis
 Title
 Shape Based Function Estimation.
 Creator

Dasgupta, Sutanoy, Srivastava, Anuj, Pati, Debdeep, Klassen, E. (Eric), Huffer, Fred W. (Fred William), Wu, Wei, Florida State University, College of Arts and Sciences, Department of Statistics
 Abstract/Description

Estimation of functions is an extremely rich and well-researched topic with broad applications spanning several scientific fields. We develop a shape-based framework for probability density and general function modelling. The framework encompasses both shape-constrained and unconstrained estimation, and can accommodate a much broader notion of shape constraints than has been considered in the literature. The estimation approach is a two-step process: the first step creates a template, or initial guess, and the second, important step "improves" the estimate according to an appropriate objective function. We derive asymptotic properties of the estimators in different scenarios, and illustrate the performance of the estimate through several simulation as well as real data examples.
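The template-then-improve idea might be sketched as follows for density estimation. Everything here is a stand-in assumption rather than the authors' framework: the template is a single Gaussian fit by moments, the "improvement" step searches a small family of symmetric two-component deformations, and the objective is the log-likelihood.

```python
import numpy as np

rng = np.random.default_rng(2)

# Bimodal sample: a 50/50 mix of N(-2, 1) and N(2, 1) (assumed test case).
x = np.concatenate([rng.normal(-2, 1, 500), rng.normal(2, 1, 500)])

def normal_pdf(t, mu, sigma):
    return np.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Step 1: template -- a single Gaussian fit by sample moments.
mu0, s0 = x.mean(), x.std()
loglik_template = np.sum(np.log(normal_pdf(x, mu0, s0)))

# Step 2: "improve" the template against the log-likelihood objective by
# grid-searching a simple deformation family: equal-weight two-component
# mixtures with modes symmetric about the template mean. This grid search
# is a crude stand-in for the paper's shape-deformation step.
best_ll = loglik_template
best_params = (0.0, s0)
for delta in np.linspace(0.0, 3.0, 31):
    for s in np.linspace(0.5, s0, 20):
        pdf = 0.5 * (normal_pdf(x, mu0 - delta, s)
                     + normal_pdf(x, mu0 + delta, s))
        ll = np.sum(np.log(pdf))
        if ll > best_ll:
            best_ll = ll
            best_params = (delta, s)

print(f"template loglik: {loglik_template:.1f}, improved: {best_ll:.1f}")
```

On bimodal data the improvement step moves well past the unimodal template, which is the role the second step plays in the framework: the template only has to be a reasonable starting shape.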
 Date Issued
 2019
 Identifier
 2019_Summer_Dasgupta_fsu_0071E_15347
 Format
 Thesis