Current Search: Research Repository » Statistics » College of Arts and Sciences
Search results
Pages
 Title
 LUMPABILITY AND WEAK LUMPABILITY IN FINITE MARKOV CHAINS.
 Creator

ABDELMONEIM, ATEF MOHAMED., Florida State University
 Abstract/Description

Consider a Markov chain x(t), t = 0, 1, 2, ..., with finite state space N = {1, 2, ..., n}, transition probability matrix P = (p_ij), i, j ∈ N, and initial probability vector V = (v_i), i ∈ N. For m ≤ n let A = {A_1, A_2, ..., A_m} be a partition of N. Define the process y(t) [display omitted in original; see DAI]. The new process y(t), called a function of the Markov chain, need not be Markov. If y(t) is Markov whatever the initial probability vector of x(t), then x(t) is said to be lumped to y(t) with respect to the partition A. If y(t) is Markov only for certain initial probability vectors of x(t), then x(t) is said to be weakly lumped to y(t) with respect to the partition A.

Conditions under which x(t) can be lumped or weakly lumped to y(t) with respect to A are introduced. Relationships between the two processes x(t) and y(t) and the properties of the new process y(t) are discussed.

Criteria are developed to determine whether a given Markov chain can be weakly lumped with respect to a given partition, in terms of an analysis of systems of linear equations. Necessary and sufficient conditions on the transition probability matrix of a Markov chain, a partition A of N, and a subset S of probability vectors for weak lumpability to occur are given in terms of the solution classes of these systems of linear equations. Finally, given that weak lumping occurs, the class S of all initial probability vectors which allow weak lumping is determined, as is the transition probability matrix of the lumped process y(t).

Lumpability and weak lumpability are also studied for Markov chains which are not irreducible. This involves a study of the interplay between two partitions of the state space N: the partition C induced by the closed sets of states of the Markov chain, and the partition A with respect to which lumpability is to be considered. Under the assumption that lumpability occurs, the relationships which must exist between sets of the two partitions A and C are obtained in detail. It is found, for example, that if neither partition is a refinement of the other and (A, C) form an irreducible pair of partitions over N, then for each A ∈ A and C ∈ C, A ∩ C ≠ ∅. Further conditions which the transition probability matrix P must satisfy if lumpability is to hold are obtained, as are relationships which must exist between P and P*.

Suppose a process y(t) is known to arise as a result of a weak lumping or lumping from some unknown Markov chain x(t). Let χ(t) be the class of all Markov chains x(t) with n states which yield this weak lumping or lumping. The problem of characterizing this class and a class S of initial probability vectors which allow this lumping is considered. A complete solution is given when n = 3 and m = 2.

The importance of lumpability in application is discussed.
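The strong lumpability described in this abstract can be checked numerically with the classical Kemeny-Snell row-sum criterion; the sketch below is an illustration of that standard criterion (function and variable names are ours, not the dissertation's):

```python
import numpy as np

def is_lumpable(P, partition):
    """Kemeny-Snell criterion for (strong) lumpability: for each block
    A_k of the partition, the probability of jumping into A_k must be
    the same from every state within any given block A_l."""
    for dest in partition:
        into_dest = P[:, dest].sum(axis=1)   # P(i -> dest) for each state i
        for src in partition:
            if not np.allclose(into_dest[src], into_dest[src[0]]):
                return False
    return True

# 3-state chain, lumpable w.r.t. {{0}, {1, 2}} but not {{0, 1}, {2}}:
P = np.array([[0.5, 0.3, 0.2],
              [0.4, 0.1, 0.5],
              [0.4, 0.4, 0.2]])
print(is_lumpable(P, [[0], [1, 2]]))   # True
print(is_lumpable(P, [[0, 1], [2]]))   # False
```

Here rows 1 and 2 both send probability 0.4 to {0} and 0.6 to {1, 2}, which is exactly what the criterion requires.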
 Date Issued
 1980
 Identifier
 AAI8109927, 3084860, FSDT3084860, fsu:74361
 Format
 Document (PDF)
 Title
 Testing for a time-dependent covariate effect in the linear risk model.
 Creator

Amirsehi, Kourosh., Florida State University
 Abstract/Description

We propose two tests to identify a time-dependent covariate effect in the partly parametric linear risk model, and derive asymptotic distributions of the test statistics under the assumption that the covariate effect of interest is constant. One of the asymptotic distributions depends on unknown functions, and we devise a weighted bootstrap procedure to estimate its quantiles. We also derive rates of convergence of maximum likelihood estimators of regression coefficients in both the nonparametric and the partly parametric linear risk models using the method of sieves. We carry out a simulation study to assess the performance of the proposed test and apply it to real data from a clinical trial on myelomatosis.
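The weighted (multiplier) bootstrap mentioned here can be sketched generically: perturb the n contributions to a statistic with i.i.d. mean-one weights, recompute the statistic, and read quantiles off the replicates. The toy version below, for a normalized sample mean, is an illustration under our own assumptions, not the dissertation's procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def weighted_bootstrap_quantile(xi, stat, level=0.95, B=2000, rng=rng):
    """Multiplier ("weighted") bootstrap: reweight the n contributions
    xi with i.i.d. Exp(1) weights (mean 1, variance 1), recompute the
    statistic, and return an empirical quantile of the replicates."""
    n = len(xi)
    reps = np.empty(B)
    for b in range(B):
        w = rng.exponential(1.0, size=n)
        reps[b] = stat((w - w.mean()) * xi)   # centred weights mimic the null
    return np.quantile(reps, level)

# Toy use: a critical value for T = sqrt(n) * mean(xi) under the null.
xi = rng.normal(size=200)
crit = weighted_bootstrap_quantile(xi, lambda z: np.sqrt(len(z)) * z.mean())
print(crit)
```

For standard normal data the replicate distribution is approximately N(0, 1), so the 95% quantile lands near 1.6.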
 Date Issued
 1995
 Identifier
 AAI9620872, 3088860, FSDT3088860, fsu:77659
 Format
 Document (PDF)
 Title
 Identifiability in the autopsy model of reliability theory.
 Creator

Antoine, Robin Michael., Florida State University
 Abstract/Description

Let S be a coherent system of m components acting independently. Two statistical models are considered. In the autopsy model, S is observed until it fails; the set of failed components and the failure time of the system are noted, but the failure times of the dead components are not known. In the second model, which was considered by Doss, Freitag and Proschan (Ann. Statist., 1989), the failure times of the dead components are also known.

In the autopsy model, it is not always possible to estimate or identify the component lifelengths from the observed data. A sufficient condition for the identifiability of the component distributions is given for the case in which the distributions are assumed to be analytic. Necessary and sufficient conditions are given for the case in which the distributions are assumed to belong to certain parametric families.

The model of Doss, Freitag and Proschan is considered in two special cases. In the first of these the component distributions are known to be identical; in the second, the distributions are known to be exponential. Estimators of the component and system lifelengths are given for each of these cases, and the asymptotic relative efficiency of each with respect to the corresponding estimator of Doss, Freitag and Proschan is calculated.
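The autopsy data structure is easy to simulate: the observer records only the system failure time and which components were dead at that moment. A minimal sketch for a 2-out-of-3 system (the structure function and names are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)

def autopsy(lifetimes, system_works):
    """Observe a coherent system until it fails: return the system
    failure time and the set of components dead at that moment.
    The individual failure times of the dead components are NOT kept,
    which is what makes identifiability nontrivial."""
    order = np.argsort(lifetimes)
    alive = set(range(len(lifetimes)))
    for comp in order:
        alive.discard(comp)
        if not system_works(alive):          # system just went down
            t = float(lifetimes[comp])
            dead = {i for i in range(len(lifetimes)) if lifetimes[i] <= t}
            return t, dead

# 2-out-of-3 system: functions while at least two components function.
works = lambda alive: len(alive) >= 2
T = rng.exponential(1.0, size=3)
t_sys, dead_set = autopsy(T, works)
print(t_sys, dead_set)   # fails at the second component failure
```

For the 2-out-of-3 structure the autopsy always reports exactly the two earliest-failing components.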
 Date Issued
 1992
 Identifier
 AAI9222356, 3087814, FSDT3087814, fsu:76624
 Format
 Document (PDF)
 Title
 PARTIAL SEQUENTIAL TESTS FOR THE MEAN OF A NORMAL DISTRIBUTION.
 Creator

ARGHAMI, NASSER REZA., Florida State University
 Abstract/Description

Recently, Billard (1977) introduced a truncated partial sequential procedure for testing a null hypothesis about a normal mean with known variance against a two-sided alternative hypothesis. That procedure had the disadvantage that a large number of observations is necessary if the null hypothesis is to be accepted. A new procedure is introduced which reduces the expected sample size for all mean values, with considerable reductions for values near the null mean value. Theoretical operating characteristic and average sample number functions are derived, and the empirical distribution of the sample size in some special cases is obtained.

For the case of unknown variance and a one-sided alternative hypothesis, there are a number of tests, the best known of which are those of Wald (1947) and Barnard (1952). These tests have concerned themselves with hypotheses stated in units of μ/σ. In this work, a partial sequential test procedure is introduced for hypotheses concerned only with μ. An advantage of this new procedure is its relative simplicity and ease of execution when compared to the above tests. This is essentially due to the fact that in the present procedure the transformed observations follow a central t-distribution, as distinct from the noncentral t-distribution. The difficulties caused by the noncentral distribution explain the relative lack of progress in obtaining results about the properties, such as the operating characteristic and average sample number functions, of the tests of Barnard and Wald. The key element in the present procedure is that a number of observations is taken initially before any decision is made; subsequent observations are then taken in batches, the sizes of which depend on the estimate of the variance obtained from the initial set of observations. Some properties of the procedure are studied. In particular, an approximation to the theoretical operating characteristic function is derived, and the sensitivity of the average sample number function to changes in some of the test parameters is investigated.

The ideas developed for the partial sequential t-test are extended to develop tests of hypotheses concerning the parameters of a simple linear regression equation, general linear hypotheses, and hypotheses about the mean of special cases of the multivariate normal.
 Date Issued
 1981
 Identifier
 AAI8125865, 3085070, FSDT3085070, fsu:74568
 Format
 Document (PDF)
 Title
 SOME RESULTS ON THE DISTRIBUTION OF GRUBBS ESTIMATORS.
 Creator

BRINDLEY, DENNIS ALFRED., Florida State University
 Abstract/Description

This dissertation is concerned with the estimation of error variances in a nonreplicated two-way classification and with inferences based on the estimators so derived. The postulated model used throughout the present work is

y_ij = μ_i + β_j + ε_ij,

where y_ij is the observation in the i-th row and j-th column, μ_i is the parameter representing the mean of the i-th row, β_j is the parameter representing the additional effect of the j-th column [display omitted in original; see DAI], and the ε_ij are independent, zero-mean normal variates with [display omitted in original; see DAI]. A set of unbiased estimates [display omitted in original; see DAI], developed in earlier work by Grubbs (J. Amer. Statist. Assoc. 43 (1948), 243-264), Ehrenberg (Biometrika 37 (1950), 347-357) and Russell and Bradley (Biometrika 45 (1958), 111-129), is considered.

The exact joint density of Q_1, ..., Q_r is obtained for r = 3, and two exact results are derived for testing the null hypothesis [display omitted in original; see DAI, with an unknown parameter] versus the two specific alternatives: [display omitted in original; see DAI] for at least some j, j = 1, 2, 3, and [display omitted in original; see DAI].
 Date Issued
 1982
 Identifier
 AAI8229146, 3085401, FSDT3085401, fsu:74896
 Format
 Document (PDF)
 Title
 KLEMS translog cost estimates and energy elasticities.
 Creator

Campbell, Timothy Alan., Florida State University
 Abstract/Description

Data from the Bureau of Labor Statistics (BLS) for capital, labor, energy, materials, and business services (KLEMS) are used to estimate translog cost functions. Much of the work developing and testing production and cost functions has used the same Berndt and Wood (BW) data for total manufacturing. Results from the BLS data are compared with the BW data, and considerable differences are found.

To improve the translog estimates, the Kalman filter and state space form are used in an effort to permit the time proxy for technological change to follow a random walk with drift. The general state space form provides a unified structure that subsumes other models. After smoothing, the Kalman filter model is equivalent to including a time proxy.

An error-correction model (ECM) is used to make the translog specification more dynamic. Nested within the most general ECM specification are the more restrictive static, partial adjustment, and autoregressive models. Likelihood ratio tests reject the more restricted models in favor of the general ECM specification, but theoretical symmetry and adding-up restrictions are rejected for most two-digit Standard Industrial Classification industries using the general ECM specification. Elasticities are computed for total manufacturing and compared with those found in other studies, with special emphasis on energy. Many violations of the theoretical monotonicity, own-price, and concavity requirements are found.
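For readers unfamiliar with the translog setup, the cost-share equations and the standard Berndt-Wood price-elasticity formulas can be evaluated directly, and the monotonicity and own-price checks mentioned above become simple sign tests. The parameter values below are illustrative, chosen only to satisfy symmetry and adding-up; they are not estimates from this study:

```python
import numpy as np

# Translog cost shares: s_i = alpha_i + sum_j gamma_ij * log(p_j).
# Price elasticities at a point (standard Berndt-Wood formulas):
#   e_ij = (gamma_ij + s_i * s_j) / s_i   for i != j,
#   e_ii = (gamma_ii + s_i^2 - s_i) / s_i.
alpha = np.array([0.05, 0.25, 0.10, 0.60])          # K, L, E, M (illustrative)
gamma = np.array([[ 0.02, -0.01,  0.00, -0.01],     # symmetric, zero row sums
                  [-0.01,  0.06, -0.01, -0.04],
                  [ 0.00, -0.01,  0.03, -0.02],
                  [-0.01, -0.04, -0.02,  0.07]])

def shares(log_p):
    return alpha + gamma @ log_p

def price_elasticities(s):
    e = (gamma + np.outer(s, s)) / s[:, None]
    np.fill_diagonal(e, (np.diag(gamma) + s**2 - s) / s)
    return e

s = shares(np.zeros(4))                  # at the price-normalisation point
e = price_elasticities(s)
print(np.all(s > 0))                     # monotonicity: positive shares
print(np.all(np.diag(e) < 0))            # own-price requirement
print(np.allclose(e.sum(axis=1), 0))     # homogeneity of degree zero in prices
```

The zero row sums of gamma enforce the adding-up restriction, and each elasticity row then sums to zero automatically.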
 Date Issued
 1993
 Identifier
 AAI9410157, 3088225, FSDT3088225, fsu:77029
 Format
 Document (PDF)
 Title
 ON DETERMINING THE NUMBER OF PREDICTORS IN A REGRESSION EQUATION USED FOR PREDICTION.
 Creator

CARR, MEG BRADY., Florida State University
 Abstract/Description

It is generally recognized that all the available variables should not necessarily be used as predictors in a linear regression equation. The problems which may arise from using too many predictors become especially acute in a regression equation used for prediction with independent data. In this case, the skill of prediction may actually deteriorate with increasing numbers of predictors. However, there is no definitive explanation as to why this should be so. There is also no universally accepted procedure for determining the number of predictors to use. The various regression methods which do exist are logically contrived but are also largely based on subjective considerations.

The goal of this research is to develop and test a criterion that will indicate a priori the "optimum" number of predictors to use in a prediction equation. The mean square error statistic is used to evaluate the performance of a regression equation in both the dependent and independent samples. Selecting the "best" prediction equation consists of determining the equation with the minimum estimated independent-sample mean square error. Several approximations and estimators of the independent-sample mean square error which have appeared in the literature are discussed, and two new estimators are derived.

These approximations and estimators are tested in Monte Carlo simulations to determine their skill in indicating the number of predictors which will yield the best prediction equation. The sample size, number of available predictors, correlations among the variables, distribution of the variables, and selection method are manipulated to explore how these various factors influence the performance of the mean square error estimators. It is found that the better estimators are capable of indicating a number of predictors to include in the regression equation for which the corresponding independent-sample mean square error is near the minimum value.

As a practical test, the various estimators of the independent-sample mean square error are applied to the data used in deriving the Model Output Statistics (MOS) maximum and minimum temperature forecast equations used by the National Weather Service. These prediction equations are linear regression equations derived using a forward selection method. The sequence of prediction equations corresponding to the forward trace of all the available predictors is derived for each of 192 cases and then applied to independent data. The forecasts made by the operational p = 10 predictor MOS equations are compared with those made by the equations determined by the estimators of the independent-sample mean square error. The operational equations have the best overall verification statistics. The estimators persistently underestimate the values of the independent-sample mean square error, but one of the new estimators is able to determine MOS forecast equations that perform as well as the operational equations. Furthermore, it is able to accomplish this without the use of an independent sample to help determine the optimum number of predictors.
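One classical FPE-type estimator of the independent-sample mean square error (a generic representative of the family discussed here, not one of the dissertation's two new estimators) can drive the forward-selection trace just described:

```python
import numpy as np

def fpe_mse(y, X, cols):
    """Estimate the independent-sample MSE of the OLS equation built on
    predictors `cols`: (SSE / (n - p - 1)) * (1 + (p + 1) / n),
    a final-prediction-error-type criterion."""
    n, p = len(y), len(cols)
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    sse = np.sum((y - Z @ beta) ** 2)
    return sse / (n - p - 1) * (1 + (p + 1) / n)

def forward_select(y, X):
    """Forward selection; return the predictor subset along the forward
    trace that minimises the estimated independent-sample MSE."""
    remaining, chosen = list(range(X.shape[1])), []
    trace = [(fpe_mse(y, X, []), [])]
    while remaining:
        best = min(remaining, key=lambda j: fpe_mse(y, X, chosen + [j]))
        chosen = chosen + [best]
        remaining.remove(best)
        trace.append((fpe_mse(y, X, chosen), list(chosen)))
    return min(trace)[1]

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))
y = 2 * X[:, 0] - X[:, 3] + rng.normal(size=100)   # only 2 real predictors
print(forward_select(y, X))
```

With two strong signals among eight candidates, the criterion's minimum along the forward trace recovers predictors 0 and 3 while penalizing the noise variables.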
 Date Issued
 1980
 Identifier
 AAI8026121, 3084691, FSDT3084691, fsu:74192
 Format
 Document (PDF)
 Title
 LARGE DEVIATION LOCAL LIMIT THEOREMS, WITH APPLICATIONS.
 Creator

CHAGANTY, NARASINGA RAO., Florida State University
 Abstract/Description

Let {X_n, n ≥ 1} be a sequence of i.i.d. random variables with E(X_1) = 0, Var(X_1) = 1. Let ψ(s) be the cumulant generating function (c.g.f.) and γ [display omitted in original; see DAI] be the large deviation rate of X_1. Let S_n = X_1 + ... + X_n. Under some mild conditions on ψ, Richter (Theory Prob. Appl. (1957) 2, 206-219) showed that the probability density function f_n of S_n/√n has the asymptotic expression [display omitted in original; see DAI] whenever x_n = o(√n) and x_n > 1. In this dissertation we obtain similar large deviation local limit theorems for arbitrary sequences of random variables, not necessarily sums of i.i.d. random variables, thereby increasing the applicability of Richter's theorem. Let {T_n, n ≥ 1} be an arbitrary sequence of nonlattice random variables with characteristic function (c.f.) φ_n. Let ψ_n, γ_n be the c.g.f. and the large deviation rate of T_n/n. The main theorem in Chapter II shows that under some standard conditions on ψ_n, which imply that T_n/n converges to a constant in probability, the density function K_n of T_n/n has the asymptotic expression [display omitted in original; see DAI], where m_n is any sequence of real numbers and τ_n is defined by ψ_n'(τ_n) = m_n. When T_n is the sum of n i.i.d. random variables our result reduces to Richter's theorem. Similar theorems for lattice-valued random variables are also presented, which are useful in obtaining asymptotic probabilities for the Wilcoxon signed-rank test statistic and Kendall's tau.

In Chapter III we use the results of Chapter II to obtain a central limit theorem for sums of a triangular array of dependent random variables X_j^(n), j = 1, ..., n, with joint distribution given by z_n^(-1) exp{H_n(x_1, ..., x_n)} Π dP(x_j), where x_i ∈ R for all i ≥ 1. The function H_n(x_1, ..., x_n) is known as the Hamiltonian. Here P is a probability measure on R. When H_n(x_1, ..., x_n) = log φ_n(s_n/n), where s_n = x_1 + ... + x_n, and the probability measure P satisfies appropriate conditions, we show that there exist an integer r ≥ 1 and a sequence τ_n such that (S_n - nτ_n)/n^(1 - 1/2r) has a limiting distribution which is non-Gaussian if r ≥ 2. This result generalizes the theorems of Jong-Woo Jeon (Ph.D. Thesis, Dept. of Stat., F.S.U. (1979)) and Ellis and Newman (Z. Wahrscheinlichkeitstheorie und Verw. Gebiete (1978) 44, 117-139). Chapters IV and V extend the above to the multivariate case.
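Richter-type local limit expressions can be checked numerically in a case where the exact density is known. For the sample mean of n Exp(1) variables, the c.g.f. is ψ(s) = -log(1 - s) and the large deviation rate is γ(m) = m - 1 - log m, so the leading-order (saddlepoint) form √(n/(2π ψ''(τ))) exp(-n γ(m)) can be compared with the exact Gamma density (our illustrative example, not taken from the dissertation):

```python
from math import gamma, log, sqrt, pi, exp

def exact_density(m, n):
    """Exact density of the mean of n i.i.d. Exp(1) variables at m:
    the mean is Gamma(n, scale = 1/n)."""
    return n**n * m**(n - 1) * exp(-n * m) / gamma(n)

def saddlepoint_density(m, n):
    """Leading-order large-deviation local limit approximation,
    with psi(s) = -log(1 - s), tau solving psi'(tau) = m."""
    tau = 1.0 - 1.0 / m                  # psi'(tau) = 1/(1 - tau) = m
    psi2 = 1.0 / (1.0 - tau)**2          # psi''(tau) = m^2
    rate = m - 1.0 - log(m)              # Legendre transform of psi at m
    return sqrt(n / (2 * pi * psi2)) * exp(-n * rate)

n, m = 25, 1.5                           # a point in the large-deviation range
print(exact_density(m, n), saddlepoint_density(m, n))
```

For the Gamma case the relative error of the approximation is of order 1/(12n), so already at n = 25 the two values agree to better than one percent.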
 Date Issued
 1982
 Identifier
 AAI8225279, 3085419, FSDT3085419, fsu:74914
 Format
 Document (PDF)
 Title
 PARTIAL ORDERINGS, WITH APPLICATIONS TO RELIABILITY (PARTIAL ORDERINGS, SCHUR-OSTROWSKI THEOREM, INEQUALITIES).
 Creator

CHAN, WAI TAT., Florida State University
 Abstract/Description

This dissertation is a contribution to the use of inequalities in reliability theory. Specifically, we study three partial orderings, develop some useful properties of these orderings, and apply them to obtain several applications in reliability.

The first partial ordering is the notion of convex ordering among life distributions. This is in the spirit of Hardy, Littlewood, and Polya (1952), who introduced the concept of relative convexity. Many parametric families of distribution functions encountered in reliability theory are convex-ordered. Different coherent structures can also be compared with respect to this partial ordering.

The second partial ordering is the ordering of majorization among integrable functions. This ordering is a generalization of the majorization ordering of Hardy, Littlewood, and Polya (1952) for vectors in n-dimensional Euclidean spaces. The concept of majorization among vectors plays a fundamental role in establishing various inequalities. These inequalities can be recast as statements that certain functions are increasing with respect to the ordering of majorization. Such functions are called Schur-convex functions. An important result in the theory of majorization is the Schur-Ostrowski Theorem, which characterizes Schur-convex functions. A functional defined on the space of integrable functions is said to be Schur-convex if it is increasing with respect to the ordering of majorization. We obtain an analogue of the Schur-Ostrowski Theorem which characterizes Schur-convex functionals in terms of their Gateaux differentials.

The third partial ordering is the ordering of unrestricted majorization among integrable functions. This partial ordering is similar to majorization but does not involve the use of decreasing rearrangements. We establish another analogue of the Schur-Ostrowski Theorem for functionals increasing with respect to the partial ordering of unrestricted majorization.
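For vectors, the Hardy-Littlewood-Polya majorization ordering referenced above reduces to a comparison of partial sums of decreasing rearrangements, and Schur-convex functions (the variance, for instance) are monotone with respect to it. A small sketch of the vector case (our illustration of the classical definition):

```python
import numpy as np

def majorizes(x, y, tol=1e-12):
    """x majorizes y (Hardy-Littlewood-Polya): equal totals, and every
    partial sum of the decreasing rearrangement of x dominates the
    corresponding partial sum for y."""
    xs, ys = np.sort(x)[::-1], np.sort(y)[::-1]
    if abs(xs.sum() - ys.sum()) > tol:
        return False
    return bool(np.all(np.cumsum(xs) >= np.cumsum(ys) - tol))

x, y = [3, 1, 0], [2, 1, 1]               # same total 4; x is "more spread out"
print(majorizes(x, y), majorizes(y, x))   # True False
# A Schur-convex function such as the variance respects the ordering:
print(np.var(x) >= np.var(y))             # True
```

The partial sums 3, 4, 4 dominate 2, 3, 4, so x majorizes y, and any Schur-convex function is at least as large at x as at y.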
 Date Issued
 1985
 Identifier
 AAI8509841, 3086034, FSDT3086034, fsu:75520
 Format
 Document (PDF)
 Title
 ON SEQUENTIAL UNBIASED AND BAYES-TYPE ESTIMATES OF PARAMETERS IN A CONTINGENCY TABLE.
 Creator

CHEN, CHENGCHUNG., Florida State University
 Abstract/Description

Estimation of the probability parameters in a contingency table with linear and/or log-linear constraints on the parameters is the principal concern of this thesis. Sequential unbiased estimates of the cell probabilities as well as some Bayes posterior-mean-type estimates are considered.

Chapter I is a review of some earlier work on the sequential unbiased estimation of the probability parameter in a Bernoulli process. The review begins with the classical work of Girshick, Mosteller and Savage (1946) and some follow-up studies such as Wolfowitz (1946), Savage (1947), Blackwell (1947), Lehmann and Stein (1950), DeGroot (1959) and Kagan, Linnik and Rao (1973). In several cases the original proofs have been simplified and the arguments streamlined.

Chapter II deals with the problem of sequential unbiased estimation of the parameters in a contingency table with linear and/or log-linear constraints. Multinomial Girshick, Mosteller and Savage (GMS) type stopping rules are discussed, and the corresponding unbiased estimates based on the minimal sufficient statistic are described. Consistency, in the sense of Wolfowitz (1947), of such estimates is demonstrated. Unbiased estimates of parametric functions such as log-contrasts are derived. Sufficient conditions for the completeness of the GMS-type stopping rules are given.

In Chapter III, the problem of sequential unbiased estimation of the probability parameters in the Bradley-Terry (1952) model of paired comparisons is studied. The Bradley-Terry model can be summarized as follows. Suppose that there are t treatments T_1, ..., T_t that can be pairwise compared. The Bradley-Terry model postulates that associated with treatment T_i is a "strength" parameter π_i > 0, i = 1, ..., t, such that if treatments T_i and T_j are compared, the probability that T_i is preferred to T_j is θ_ij = π_i/(π_i + π_j). The model imposes log-linear constraints on the θ_ij's, so that techniques similar to those in Chapter II may be used to obtain unbiased estimates based on a sufficient statistic.

In Chapter IV, two Bayes-type procedures for estimating the multinomial cell probability vector p, in the presence of linear constraints on the parameters, are proposed and illustrated with examples. A general prior is used, with the restriction that the moment generating function of the prior exists in closed form. The estimators are shown to be strongly consistent. Estimation under log-linear constraints is also considered. Finally, Bayes-type estimators for the covariance matrix of the cell frequencies are presented for some special cases of linearly and log-linearly constrained problems.

Chapter V is concerned with a Bayesian approach to the estimation of parameters in the Bradley-Terry model of paired comparisons. It is assumed that the sum of the treatment parameters π_i is 1, and a Dirichlet prior for π = (π_1, ..., π_t) is used. Using the induced prior of θ_ij and Z_ij = π_i + π_j, an estimate π_ij of π_i, based on the data arising from the comparisons of treatments T_i and T_j, is obtained. An estimate of π_i based on all the data is a weighted combination of the π_ij's that minimizes a risk function. Similarly, estimates for log-contrasts of the π_i's are obtained. This technique of estimation is extended to the Luce model of multiple comparisons.
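The Bradley-Terry preference probability θ_ij = π_i/(π_i + π_j) also admits a simple fixed-point computation of maximum likelihood strengths, the classical Zermelo/MM iteration. It is shown here only to make the model concrete; the dissertation's sequential and Bayesian estimators are different procedures:

```python
import numpy as np

def bradley_terry_mle(wins, iters=200):
    """Fit Bradley-Terry strengths by the classical Zermelo/MM iteration:
    pi_i <- w_i / sum_{j != i} n_ij / (pi_i + pi_j), then renormalise so
    the strengths sum to 1 (the constraint used in Chapter V)."""
    t = wins.shape[0]
    n = wins + wins.T                      # comparisons per pair
    w = wins.sum(axis=1)                   # total wins of each treatment
    pi = np.full(t, 1.0 / t)
    for _ in range(iters):
        denom = np.array([sum(n[i, j] / (pi[i] + pi[j])
                              for j in range(t) if j != i) for i in range(t)])
        pi = w / denom
        pi /= pi.sum()
    return pi

# wins[i, j] = number of times T_i was preferred to T_j (10 comparisons per pair)
wins = np.array([[0, 7, 9],
                 [3, 0, 6],
                 [1, 4, 0]])
pi = bradley_terry_mle(wins)
print(pi, pi[0] / (pi[0] + pi[1]))   # strengths and theta_12
```

With every pair compared and each treatment recording at least one win and one loss, the iteration converges to the unique normalized MLE.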
 Date Issued
 1981
 Identifier
 AAI8125818, 3085061, FSDT3085061, fsu:74559
 Format
 Document (PDF)
 Title
 Identifying influential effects in factorial experiments with sixteen runs: Empirical Bayes approaches.
 Creator

Chen, ChingHsiang., Florida State University
 Abstract/Description

To identify influential effects in unreplicated (possibly fractionated) factorial experiments, the effect-sparsity assumption (Box and Meyer (1986), Technometrics 28, 11-18) has been adopted in many studies. Although this assumption has traditionally been used for outlier-detection problems, it may not be suitable for describing the effects from factorial experiments. In this research, we examine the effect-sparsity approach and propose empirical Bayes methods relaxing this assumption. The study also examines the identification of influential effects based on information about the design structure, such as the alias relationships, design resolution, and sizes of interactions. A simulation study, based primarily on the criterion of reducing the experimental cost of misidentifying factors, has been performed to compare different methods. The results show that when the number of factors is large and the factorial experiment is highly fractionated, incorporating information about the design structure into the analysis reduces the cost in a screening experiment compared to methods that do not consider design structure.
 Date Issued
 1994
 Identifier
 AAI9424751, 3088354, FSDT3088354, fsu:77159
 Format
 Document (PDF)
 Title
 ON NONPARAMETRIC ESTIMATION OF DENSITY AND REGRESSION FUNCTIONS.
 Creator

CHENG, PHILIP E., The Florida State University
 Abstract/Description

In the field of statistical estimation, nonparametric procedures have received increased attention for the past decade. In particular, various nonparametric estimates of probability density functions and regression curves have been extensively studied, with special attention to large sample properties...
 Date Issued
 1980
 Identifier
 AAI8020329, 2989654, FSDT2989654, fsu:74161
 Format
 Document (PDF)
 Title
 A comparison of two methods of bootstrapping in a reliability model.
 Creator

Chiang, YuangChin., Florida State University
 Abstract/Description

We consider bootstrapping in the following reliability model, which was considered by Doss, Freitag, and Proschan (1987). Available for testing is a sample of iid systems, each having the same structure of m independent components. Each system is continuously observed until it fails. For every component in each system, either a failure time or a censoring time is recorded. A failure time is recorded if the component fails before or at the time of system failure; otherwise a censoring time is recorded. To estimate the distributions of the component lifelengths F_1, ..., F_m, one can formally compute the Kaplan-Meier estimates F̂_1, ..., F̂_m. Various quantities of interest, such as the probability that a new system will survive time t_0, may then be estimated by combining F̂_1, ..., F̂_m in a suitable way. In this model, bootstrapping can be carried out in two different ways. One can resample n systems at random from the original n systems. Alternatively, one can construct artificial systems by generating independent random lifelengths from the Kaplan-Meier estimates F̂_j and from those form artificial data. The two methods are distinct. We show that, asymptotically, bootstrapping by either method yields correct answers. We also compare the two methods via simulation studies.
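The two resampling schemes described in the abstract can be sketched in code. This is a minimal illustration under a stated assumption (a series system, which fails when its first component fails), not the author's implementation; all function names are hypothetical.

```python
import random

def kaplan_meier(obs):
    """obs: list of (time, event) pairs, event=1 for failure, 0 for censoring.
    Returns the Kaplan-Meier survival curve as a sorted list of (time, S(t))
    pairs, handling ties one observation at a time."""
    data = sorted(obs)
    n, s, curve = len(data), 1.0, []
    for t, d in data:
        if d:
            s *= (n - 1) / n
        curve.append((t, s))
        n -= 1
    return curve

def draw_from_km(curve, rng):
    """Inverse-CDF draw from a KM curve; residual mass goes to the largest time."""
    u = rng.random()
    for t, s in curve:
        if 1.0 - s >= u:
            return t
    return curve[-1][0]

def bootstrap_systems(systems, rng):
    """Method 1: resample n whole systems with replacement."""
    return [rng.choice(systems) for _ in systems]

def bootstrap_components(systems, m, rng):
    """Method 2: fit KM per component, generate independent component
    lifelengths, and re-censor them at the artificial system failure time."""
    curves = [kaplan_meier([sys[j] for sys in systems]) for j in range(m)]
    boot = []
    for _ in systems:
        t = [draw_from_km(c, rng) for c in curves]
        life = min(t)  # a series system fails at the first component failure
        boot.append([(min(tj, life), 1 if tj <= life else 0) for tj in t])
    return boot
```

Method 1 preserves the dependence induced by censoring within a system; Method 2 rebuilds artificial systems from the estimated component distributions. The thesis shows both are asymptotically correct.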
 Date Issued
 1988
 Identifier
 AAI8906216, 3161719, FSDT3161719, fsu:77918
 Format
 Document (PDF)
 Title
 Ridge regression: Application to educational data.
 Creator

Churngchow, Chidchanok., Florida State University
 Abstract/Description

Ridge regression is a regression technique developed to remedy the problem of multicollinearity in regression analysis. The major problem with multicollinearity is that it causes high variances in the estimates of the regression coefficients. The ridge model introduces some bias into the regression equation in order to reduce the variance of the estimators. The purposes of this study were to demonstrate the application of the ridge regression model to educational data and to compare the characteristics and performance of the ridge method and the least squares method. In this study, four types of ridge were compared to the least squares method: ridge trace, generalized, ordinary, and directed ridge. The sample for this study consisted of 141 public schools in Dade County, Florida. The dependent variable was the students' average scores in mathematical computation and reading comprehension. Six variables representing teacher and student characteristics were employed as the predictors. The performance of ridge and least squares was compared in terms of the confidence interval of an individual estimator and the predictive accuracy of the whole model. Since statistical inference for the ridge method has not been completely developed, the bootstrap technique, with a sample size of twenty, was used to calculate the confidence interval of each estimator. The study resulted in a successful application of ridge regression to school-level data, in which it was found that (1) ridge regression yielded a smaller confidence interval for every estimated regression coefficient and (2) ridge regression produced higher predictive accuracy than ordinary least squares. Since the results were based on one particular set of data, it cannot be guaranteed that ridge always outperforms the least squares method.
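The bias-variance trade-off at the heart of this study can be sketched with the standard closed form of the ordinary ridge estimator. This is an illustrative example with simulated data, not the dissertation's school data or code:

```python
import numpy as np

def ridge(X, y, k):
    """Ordinary ridge estimator (X'X + kI)^(-1) X'y; k = 0 gives least squares."""
    return np.linalg.solve(X.T @ X + k * np.eye(X.shape[1]), X.T @ y)

# Two nearly collinear predictors make the least squares (k = 0) coefficients
# wildly unstable; a small ridge constant shrinks them to a moderate size.
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + 1e-3 * rng.normal(size=100)   # almost an exact copy of x1
X = np.column_stack([x1, x2])
y = x1 + rng.normal(size=100)

beta_ols = ridge(X, y, 0.0)
beta_ridge = ridge(X, y, 1.0)
```

The norm of the ridge solution is non-increasing in k, so the shrinkage is guaranteed; the smaller bootstrap confidence intervals reported in the study reflect exactly this variance reduction.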
 Date Issued
 1988
 Identifier
 AAI8805652, 3086742, FSDT3086742, fsu:76217
 Format
 Document (PDF)
 Title
 A hypothesis test of cumulative sums of multinomial parameters.
 Creator

Clair, James Hunter., Florida State University
 Abstract/Description

Consider N times to repair, T_1, T_2, ..., T_N, from a repair time distribution function F(·). Let p_{01}, p_{02}, ..., p_{0K} be K proportions with Σ_{ν=1}^{K} p_{0ν} < 1. We wish to have at least 100(Σ_{ν=1}^{i} p_{0ν})% of items repaired by time L_i, 1 ≤ i ≤ K, K ≥ 2. Denote the unknown quantity F(L_i) − F(L_{i−1}) as p_i, 1 ≤ i ≤ K. Thus we wish to test the hypothesis [unformatted equation omitted in source]. A simple procedure is to test this hypothesis with the K statistics N_1, Σ_{ν=1}^{2} N_ν, ..., Σ_{ν=1}^{K} N_ν, where Σ_{ν=1}^{i} N_ν is the number of repairs that take place on or before L_i, 1 ≤ i ≤ K. Each Σ_{ν=1}^{i} N_ν is a binomial random variable with unknown parameter Σ_{ν=1}^{i} p_ν. The hypothesis H_0 is rejected if any of the Σ_{ν=1}^{i} N_ν ≤ n_i^0, where the n_i^0 are chosen from binomial tables. This test is shown to have several deficiencies. We construct an alternative procedure with which to test this hypothesis. The generalized likelihood ratio test (GLRT) statistic is based on the multinomial random variable (N_1, N_2, ..., N_K), with parameter (p_1, p_2, ..., p_K). 
The parameter space is [unformatted equation omitted in source]. An algorithm is constructed and computer code supplied to calculate λ(N) efficiently for any finite N. For small samples, computer code is given to calculate exactly δ or a p-value for an observed value of λ(N(K)), 2 ≤ K ≤ 5 and K ≤ N ≤ N(K). For large N, we apply a theorem of Feder (1968) to evaluate the asymptotic critical values and power. The GLRT statistic λ(N) is shown to be approximately a union-intersection test and thus is approximated by a collection of uniformly most powerful unbiased tests of binomial parameters. The GLRT is shown empirically, in the case K = 3, to have higher power than competing union-intersection tests. Two power estimation techniques are described and compared empirically. Reference: Feder, Paul J. (1968), "On the distribution of the log likelihood ratio test statistic when the true parameter is 'near' the boundaries of the hypothesis region," Annals of Mathematical Statistics, 39, 2044-2055.
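The simple binomial procedure that the dissertation improves upon can be sketched concretely. This is an illustrative reconstruction (the helper names are mine, not the dissertation's code), using exact binomial tail probabilities in place of binomial tables:

```python
from math import comb

def binom_cdf(x, n, p):
    """P(Bin(n, p) <= x), computed directly from the pmf."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(x + 1))

def critical_value(n, p, alpha):
    """Largest n0 with P(Bin(n, p) <= n0) <= alpha, or -1 if none exists."""
    n0 = -1
    while binom_cdf(n0 + 1, n, p) <= alpha:
        n0 += 1
    return n0

def simple_test(cum_counts, p0, n, alpha):
    """cum_counts[i] = N_1 + ... + N_{i+1}; under H0 it is Bin(n, p_01+...+p_0(i+1)).
    Reject H0 when any cumulative count falls at or below its critical value."""
    cum_p = 0.0
    for c, p in zip(cum_counts, p0):
        cum_p += p
        if c <= critical_value(n, cum_p, alpha):
            return True   # reject H0
    return False
```

Because each of the K comparisons is carried out at level alpha, the overall size of this procedure is hard to control, which is one of the deficiencies the GLRT alternative addresses.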
 Date Issued
 1988
 Identifier
 AAI8822443, 3161637, FSDT3161637, fsu:77837
 Format
 Document (PDF)
 Title
 TWOWAY CLUSTER ANALYSIS WITH NOMINAL DATA.
 Creator

COOPER, PAUL GAYLORD., Florida State University
 Abstract/Description

Consider an M by N data matrix X whose elements may assume values 0, 1, 2, ..., H. Denote the rows of X by α_1, α_2, ..., α_M. A tree on the rows of X is a sequence of distinct partitions {P_i}_{i=1}^{k} such that: (a) P_1 = {(α_1), ..., (α_M)}, (b) P_i is a refinement of P_{i+1} for i = 1, ..., k-1, and (c) P_k = {(α_1, ..., α_M)}. The two-way clustering problem consists of simultaneously constructing trees on the rows, columns, and elements of X. A generalization of a two-way joining algorithm (TWJA) introduced by J. A. Hartigan (1975) is used to construct the three trees. The TWJA requires the definition of measures of dissimilarity between row clusters and between column clusters. Two approaches are used in the construction of these dissimilarity coefficients: one based on intuition and one based on a formal prediction model. For matrices with binary elements (0 or 1), measures of dissimilarity between row or column clusters are based on the number of mismatching pairs. Consider two distinct row clusters R_p and R_q containing m_p and m_q rows respectively. One measure of dissimilarity between R_p and R_q, d_0(R_p, R_q), is [equation omitted in source; see DAI], where b_{pβ} and b_{qβ} are the numbers of ones in column β of clusters R_p and R_q respectively. Two additional intuitive dissimilarity coefficients are also defined and studied. For matrices containing nominal-level data, dissimilarity coefficients are based on a formal prediction model. Analogous to the procedure of Cleveland and Relles (1974), for a given data matrix the model consists of a scheme for random selection of two rows (or columns) from the matrix and an identification rule for distinguishing between the two rows (or columns). 
A loss structure is defined for both rows and columns, and the expected loss due to incorrect row or column identification is computed. The dissimilarity between two (say) row clusters is then defined to be the increase in expected loss due to joining those two row clusters into a single cluster. Stopping criteria are suggested for both the intuitive and the prediction model approaches. For the intuitive approach, it is suggested that joining be stopped when the dissimilarity between the (say) row clusters to be joined next exceeds that expected by chance under the assumption that the (say) column totals of the matrix are fixed. For the prediction model approach, the stopping criterion is based on a cluster prediction model in which the objective is to distinguish between row or column clusters. A cluster identification rule is defined based on the information in the partitioned data matrix, and the expected loss due to incorrect cluster identification is computed. The expected cluster loss is also computed when cluster identification is based on strict randomization. The relative decrease in expected cluster loss due to identification based on the partitioned matrix versus that based on randomization is suggested as a stopping criterion. Both contrived and real data examples are used to illustrate and compare the two clustering procedures. Computational aspects of the procedure are discussed, and it is concluded that the intuitive approach is less costly in terms of computation time. Further, five admissibility properties are defined and, for certain intuitive dissimilarity coefficients, the trees produced by the TWJA are shown to possess three of the five properties.
 Date Issued
 1980
 Identifier
 AAI8026123, 3084693, FSDT3084693, fsu:74194
 Format
 Document (PDF)
 Title
 STOCHASTIC VERSIONS OF REARRANGEMENT INEQUALITIES WITH APPLICATIONS TO STATISTICS.
 Creator

D'ABADIE, CATHERINE ANNE., Florida State University
 Abstract/Description

In this dissertation we develop a theory which offers a unified approach to the problem of obtaining stochastic versions of deterministic rearrangement inequalities. To develop the theory, we first define two new classes of functions and establish preservation properties of these functions under various statistical and mathematical operations. Next we introduce the notion of stochastically similarly arranged (SSA) pairs of random vectors. We prove that if the random vectors (X, Y) are SSA and the function f from R^n × R^n into R^n is monotone with respect to a certain partial ordering on R^n × R^n, then for every permutation π the stochastic inequalities [equation omitted in source; see DAI] hold. This result yields a unified way of obtaining stochastic versions of rearrangement inequalities. We then show that many multivariate densities of interest in statistical practice govern pairs of random vectors which are SSA. Next we show that the property of being SSA is preserved under certain statistical operations on pairs of SSA random vectors. For example, we show that the rank order of SSA random variables is SSA. We also show that the SSA property is preserved under certain contamination models. Finally, we show how the results we obtain can be applied to problems in hypothesis testing.
 Date Issued
 1981
 Identifier
 AAI8205717, 3085181, FSDT3085181, fsu:74676
 Format
 Document (PDF)
 Title
 ESTIMATING MULTIDIMENSIONAL TABLES FROM SURVEY DATA: PREDICTING MAGAZINE AUDIENCES.
 Creator

DANAHER, PETER JOSEPH., Florida State University
 Abstract/Description

Suppose an advertiser constructs an advertising campaign by placing k advertisements in a magazine. He then estimates the proportion of the population which sees none, one, or up to all k advertisements (called the exposure distribution). Several criteria for evaluating the effectiveness of the campaign can be obtained directly from the exposure distribution. Two of them are reach, the proportion of the population which is exposed to at least one of the advertisements, and effective reach, the mean of the exposure distribution. We develop three exposure distribution models for the cases where advertising campaigns comprise one, two, or three or more magazines. The models build on each other in that the model for one magazine is used to improve the fit of the model for two magazines, and the model for two magazines is used to estimate the parameters of the model for three or more magazines. A thorough empirical test, using the AGB:McNair "National Media Survey", shows that each of our models outperforms the best currently available models. In addition, the three models are proved to have optimal asymptotic properties. The models are used to select a media schedule which maximizes either reach or effective reach subject to a budget constraint. A monotonicity property of reach and effective reach yields an algorithm for optimizing both that greatly reduces computation time over conventional methods used to solve integer programming problems. It is more useful to estimate the proportion of the population which sees the advertisements in a magazine than the proportion which sees the magazine. Often, however, no advertisement recall data are available, so we are forced to estimate the proportion which is exposed to just the magazines. If advertisement recall data are available, we give a natural and simple adjustment of the original magazine exposure data to get advertisement exposure data. 
Our models also give an excellent fit to these adjusted exposure data.
 Date Issued
 1987
 Identifier
 AAI8721837, 3086665, FSDT3086665, fsu:76140
 Format
 Document (PDF)
 Title
 Ultrafast Lattice Dynamics in Metal Thin Films and Nanoparticles.
 Creator

Wang, Xuan, Cao, Jim, Yang, Wei, Bonesteel, Nicholas, Riley, Mark, Xiong, Peng, Department of Physics, Florida State University
 Abstract/Description

This thesis presents the development of the 3rd-generation femtosecond electron diffractometer (FED) in Professor Jim Cao's group and its application to the study of ultrafast structural dynamics in solid-state materials. The 3rd-generation FED surpasses its predecessor and other similar FED instruments with a DC electron gun that can generate much higher-energy electron pulses and a more efficient imaging system. This combination, together with miscellaneous improvements, significantly boosts the signal-to-noise ratio and thus enables us to study more complex solid-state materials. Two main thrusts are discussed in detail in this thesis. The first is the dynamics of coherent phonon generation by ultrafast heating in gold thin films and nanoparticles, which emphasizes the electronic thermal stress. The other is the ultrafast dynamics in nickel, which shows that the mutual interactions among the lattice, spin, and electron subsystems can significantly alter the ultrafast lattice dynamics. In these studies, we exploit the advantages of the FED instrument as an ideal tool that can directly and simultaneously monitor the coherent and random motions of the lattice.
 Date Issued
 2010
 Identifier
 FSU_migr_etd1247
 Format
 Thesis
 Title
 Time-Varying Coefficient Models with ARMA-GARCH Structures for Longitudinal Data Analysis.
 Creator

Zhao, Haiyan, Niu, Xufeng, Huffer, Fred, Nolder, Craig, McGee, Dan, Department of Statistics, Florida State University
 Abstract/Description

The motivation for my research comes from the analysis of the Framingham Heart Study (FHS) data. The FHS is a long-term prospective study of cardiovascular disease in the community of Framingham, Massachusetts. The study began in 1948, and 5,209 subjects were initially enrolled. Examinations were given biennially to the study participants, and their status associated with the occurrence of disease was recorded. In this dissertation, the event we are interested in is the incidence of coronary heart disease (CHD). Covariates considered include sex, age, cigarettes per day (CSM), serum cholesterol (SCL), systolic blood pressure (SBP), and body mass index (BMI, weight in kilograms/height in meters squared). A review of the statistical literature indicates that the effects of the covariates on cardiovascular disease, or on death from all causes, in the Framingham study change over time. For example, the effect of SCL on cardiovascular disease decreases linearly over time. In this study, I examine the time-varying effects of the risk factors on CHD incidence. Time-varying coefficient models with ARMA-GARCH structure are developed in this research. The maximum likelihood and marginal likelihood methods are used to estimate the parameters in the proposed models. Since high-dimensional integrals are involved in the calculation of the marginal likelihood, the Laplace approximation is employed in this study. Simulation studies are conducted to evaluate the performance of these two estimation methods for our proposed models. The Kullback-Leibler (KL) divergence and the root mean square error are employed in the simulation studies to compare the results obtained from the different methods. Simulation results show that the marginal likelihood approach gives more accurate parameter estimates but is more computationally intensive. 
Following the simulation study, our proposed models are applied to the Framingham Heart Study to investigate the time-varying effects of covariates with respect to CHD incidence. To specify the time-series structures of the effects of risk factors, the Bayesian Information Criterion (BIC) is used for model selection. Our study shows that the relationship between CHD and the risk factors changes over time. For males, there is a clearly decreasing linear trend in the age effect, which implies that the age effect on CHD is weaker for older patients than for younger ones. The effect of CSM stays almost the same for the first 30 years and decreases thereafter. There are slightly decreasing linear trends in the effects of both SBP and BMI. Furthermore, the coefficients of SBP are mostly positive over time; i.e., patients with higher SBP are, as expected, more likely to develop CHD. For females, there is also a clearly decreasing linear trend in the age effect, while the effects of SBP and BMI on CHD are mostly positive and do not change much over time.
 Date Issued
 2010
 Identifier
 FSU_migr_etd0527
 Format
 Thesis
 Title
 A Comparison of Estimators in Hierarchical Linear Modeling: Restricted Maximum Likelihood versus Bootstrap via Minimum Norm Quadratic Unbiased Estimators.
 Creator

Delpish, Ayesha Nneka, Niu, Xu-Feng, Tate, Richard L., Huffer, Fred W., Zahn, Douglas, Department of Statistics, Florida State University
 Abstract/Description

The purpose of the study was to investigate the relative performance of two estimation procedures, restricted maximum likelihood (REML) and the bootstrap via MINQUE, for a two-level hierarchical linear model under a variety of conditions. Specific focus lay on observing whether the bootstrap via MINQUE procedure offered improved accuracy in the estimation of the model parameters and their standard errors in situations where normality may not be guaranteed. Through Monte Carlo simulations, the importance of this assumption for the accuracy of multilevel parameter estimates and their standard errors was assessed using the accuracy index of relative bias and by observing the coverage percentages of 95% confidence intervals constructed for both estimation procedures. The study systematically varied the number of groups at level 2 (30 versus 100), the size of the intraclass correlation (0.01 versus 0.20), and the distribution of the observations (normal versus chi-squared with 1 degree of freedom). The number-of-groups and intraclass-correlation factors produced effects consistent with those previously reported: as the number of groups increased, the bias in the parameter estimates decreased, with a more pronounced effect for the estimates obtained via REML. High levels of intraclass correlation also led to a decrease in the efficiency of parameter estimation under both methods. Study results show that while both the REML and the bootstrap via MINQUE estimates of the fixed effects were accurate, the efficiency of the estimates was affected by the distribution of the errors, with the bootstrap via MINQUE procedure outperforming REML. Both procedures produced less efficient estimators under the chi-squared distribution, particularly for the variance-covariance component estimates.
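The coverage-percentage criterion driving this simulation design can be sketched with a toy check of a nominal 95% interval for a mean under the study's two error distributions. This is an illustration, not the study's HLM simulation; all names are mine:

```python
import random
import statistics

def coverage(draw, true_mean, n=30, reps=4000, z=1.96, seed=1):
    """Fraction of Monte Carlo replications in which the nominal 95% interval
    mean +/- z * s / sqrt(n) covers the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [draw(rng) for _ in range(n)]
        m = statistics.fmean(x)
        half = z * statistics.stdev(x) / n ** 0.5
        hits += (m - half <= true_mean <= m + half)
    return hits / reps

normal_cov = coverage(lambda r: r.gauss(0.0, 1.0), 0.0)
# chi-squared with 1 df (= Z^2, mean 1) is heavily skewed, so coverage drops
chisq_cov = coverage(lambda r: r.gauss(0.0, 1.0) ** 2, 1.0)
```

The skewed chi-squared(1) errors produce noticeably under-nominal coverage, which mirrors why the study contrasts the normal and chi-squared(1) conditions.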
 Date Issued
 2006
 Identifier
 FSU_migr_etd0771
 Format
 Thesis
 Title
 Estimation from Data Representing a Sample of Curves.
 Creator

Auguste, Anna L., Bunea, Florentina, Mason, Patrick, Hollander, Myles, Huffer, Fred, Department of Statistics, Florida State University
 Abstract/Description

This dissertation introduces and assesses an algorithm for generating confidence bands for a regression function or a main effect when multiple data sets are available. In particular, it proposes to construct confidence bands for the different trajectories and then aggregate these to produce an overall confidence band for a mean function. An estimator of the regression function or main effect is also examined. First, nonparametric estimators and confidence bands are formed on each data set separately. Then each data set is in turn treated as a testing set for aggregating the preliminary results from the remaining data sets. The criterion used for this aggregation is either the least squares (LS) criterion or a BIC-type penalized LS criterion. The proposed estimator is the average over data sets of these aggregates; it is thus a weighted sum of the preliminary estimators. When there is only a main effect, the proposed confidence band is the minimum-L1 band among the M aggregate bands. In the case where there is some random effect, we suggest an adjustment to the confidence band; the proposed confidence band is then the minimum-L1 band among the M adjusted aggregate bands. Desirable asymptotic properties are shown to hold. A simulation study examines the performance of each technique relative to several alternative methods and theoretical benchmarks. An application to seismic data is conducted.
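The final selection step, choosing the minimum-L1 band among M candidate bands evaluated on a grid, can be sketched as follows. This is a toy example with made-up bands, not the dissertation's estimator:

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 101)
mean_fn = np.sin(grid)  # a stand-in for the estimated mean function

# three hypothetical aggregate bands (lower, upper), each of constant half-width
bands = [(mean_fn - w, mean_fn + w) for w in (0.30, 0.22, 0.40)]

# L1 size of a band on the grid: its average width; keep the smallest band
widths = [float(np.mean(u - l)) for l, u in bands]
best = int(np.argmin(widths))
```

In practice the M aggregate bands come from holding out each data set in turn, and the minimum-L1 rule picks the tightest band that still carries the aggregated coverage guarantee.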
 Date Issued
 2006
 Identifier
 FSU_migr_etd0286
 Format
 Thesis
 Title
 Statistical Shape Analysis on Manifolds with Applications to Planar Contours and Structural Proteomics.
 Creator

Ellingson, Leif A., Patrangenaru, Vic, Mio, Washington, Zhang, Jinfeng, Niu, Xufeng, Department of Statistics, Florida State University
 Abstract/Description

The technological advances of recent years have produced a wealth of intricate digital imaging data that is analyzed effectively using the principles of shape analysis. Such data often lie on either high-dimensional or infinite-dimensional manifolds. With computing power now strong enough to handle such data, it is necessary to develop theoretically sound methodology that performs the analysis in a computationally efficient manner. In this dissertation, we propose approaches for doing so for planar contours and the three-dimensional atomic structures of protein binding sites. First, we adapt Kendall's definition of direct similarity shapes of finite planar configurations to shapes of planar contours under certain regularity conditions and utilize Ziezold's nonparametric view of Fréchet mean shapes. The space of direct similarity shapes of regular planar contours is embedded in a space of Hilbert-Schmidt operators in order to obtain the Veronese-Whitney extrinsic mean shape. For computations, it is necessary to use discrete approximations of both the contours and the embedding. For cases when landmarks are not provided, we propose an automated, randomized landmark selection procedure that is useful for contour matching within a population and is consistent with the underlying asymptotic theory. For inference on the extrinsic mean direct similarity shape, we consider a one-sample neighborhood hypothesis test and the use of the nonparametric bootstrap to approximate confidence regions. Bandulasiri et al. (2008) suggested using extrinsic reflection size-and-shape analysis to study the relationship between the structure and function of protein binding sites. In order to obtain meaningful results with this approach, it is necessary to identify the atoms common to a group of binding sites with similar functions and obtain proper correspondences for these atoms. 
We explore this problem in depth and propose an algorithm, based on the Iterative Closest Point algorithm, for simultaneously finding the common atoms and their respective correspondences. For a benchmark data set, our classification results compare favorably with those of leading established methods. Finally, we discuss current directions in the field of statistics on manifolds, including a computational comparison of intrinsic and extrinsic analysis for various applications and a brief introduction to sample spaces with manifold stratification.
 Date Issued
 2011
 Identifier
 FSU_migr_etd0053
 Format
 Thesis
 Title
 Individual Patient-Level Data Meta-Analysis: A Comparison of Methods for the Diverse Populations Collaboration Data Set.
 Creator

Dutton, Matthew Thomas, McGee, Daniel, Becker, Betsy, Niu, Xufeng, Zhang, Jinfeng, Department of Statistics, Florida State University
 Abstract/Description

DerSimonian and Laird define meta-analysis as "the statistical analysis of a collection of analytic results for the purpose of integrating their findings." One alternative to classical meta-analytic approaches is known as Individual Patient-Level Data, or IPD, meta-analysis. Rather than depending on summary statistics calculated for the individual studies, IPD meta-analysis analyzes the complete data from all included studies. Two potential approaches to incorporating IPD data into the meta-analytic framework are investigated. A two-stage analysis is conducted first, in which individual models are fit for each study and then summarized using classical meta-analysis procedures. Second, a one-stage approach that models the data in a single analysis and summarizes the information across studies is investigated. Data from the Diverse Populations Collaboration (DPC) data set are used to investigate the differences between these two methods in a specific example. The bootstrap procedure is used to determine whether the two methods produce statistically different results in the DPC example. Finally, a simulation study is conducted to investigate the accuracy of each method in given scenarios.
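The contrast between the two approaches can be sketched for the simplest possible effect, a mean outcome. This is a toy example with made-up data, not the DPC analysis:

```python
import numpy as np

# three hypothetical studies with individual patient-level outcomes
studies = [np.array([1.0, 2.0, 1.5, 2.5]),
           np.array([0.5, 1.0, 0.8]),
           np.array([2.0, 2.2, 1.8, 2.1, 1.9])]

# Two-stage: summarize each study first, then pool the study-level summaries
# with inverse-variance weights (a classical fixed-effect meta-analysis).
effects = np.array([s.mean() for s in studies])
variances = np.array([s.var(ddof=1) / len(s) for s in studies])
w = 1.0 / variances
two_stage = float((w * effects).sum() / w.sum())

# One-stage: model all individual observations in a single analysis
# (here simply the grand mean of the pooled data).
one_stage = float(np.concatenate(studies).mean())
```

The two estimates generally differ: the two-stage pooled value weights precise studies more heavily, while the one-stage grand mean weights studies by their sample sizes. In the dissertation the same contrast arises with regression models instead of plain means.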
 Date Issued
 2011
 Identifier
 FSU_migr_etd0620
 Format
 Thesis
 Title
 Minimax Tests for Nonparametric Alternatives with Applications to High Frequency Data.
 Creator

Yu, Han, Song, Kai-Sheng, Professor, Jack Quine, Professor, Fred Huffer, Professor, Dan McGee, Department of Statistics, Florida State University
 Abstract/Description

We present a general methodology for developing asymptotically distribution-free, asymptotically minimax tests. The tests are constructed via a nonparametric density-quantile function, and the limiting distribution is derived by a martingale approach. The procedure can be viewed as a novel nonparametric extension of the classical parametric likelihood ratio test. The proposed tests are shown to be omnibus within an extremely large class of nonparametric global alternatives characterized by simple conditions. Furthermore, we establish that the proposed tests provide better minimax distinguishability. The tests have much greater power for detecting high-frequency nonparametric alternatives than existing classical tests such as the Kolmogorov-Smirnov and Cramér-von Mises tests. The good performance of the proposed tests is demonstrated by Monte Carlo simulations and applications in High Energy Physics.
 Date Issued
 2006
 Identifier
 FSU_migr_etd0796
 Format
 Thesis
 Title
 AP Student Visual Preferences for Problem Solving.
 Creator

Swoyer, Liesl, Department of Statistics
 Abstract/Description

The purpose of this study is to explore the mathematical preferences of high school AP Calculus students by examining their tendencies toward differing methods of thought. A student's preferred mode of thinking was measured on a scale ranging from a preference for analytical thought to a preference for visual thought as they completed derivative and antiderivative tasks presented both algebraically and graphically. This relates to previous studies by continuing to analyze the factors that have been found to mediate students' performance and preference across a variety of calculus tasks. Data were collected by Dr. Erhan Haciomeroglu at the University of Central Florida. Students' preferences were not affected by gender. Students were found to approach graphical and algebraic tasks similarly, without any significant change with regard to the derivative or antiderivative nature of the tasks. Highly analytic and highly visual students revealed the same proportion of change in visuality as harmonic students when more difficult calculus tasks were encountered. Thus, a strong preference for visual thinking when completing algebraic tasks was not the determining factor in their preferred method of thinking when approaching graphical tasks.
 Date Issued
 2012
 Identifier
 FSU_migr_uhm0052
 Format
 Thesis
 Title
 Age Effects in the Extinction of Planktonic Foraminifera: A New Look at Van Valen's Red Queen Hypothesis.
 Creator

Wiltshire, Jelani, Huffer, Fred, Parker, William, Chicken, Eric, Sinha, Debajyoti, Department of Statistics, Florida State University
 Abstract/Description

Van Valen's Red Queen hypothesis states that within a homogeneous taxonomic group the age is statistically independent of the rate of extinction. The case of the Red Queen hypothesis addressed here is when the homogeneous taxonomic group is a group of similar species. Since Van Valen's work, various statistical approaches have been used to address the relationship between taxon duration (age) and the rate of extinction. Some of the more recent approaches to this problem using Planktonic Foraminifera (Foram) extinction data include Weibull and exponential modeling (Parker and Arnold, 1997) and Cox proportional hazards modeling (Doran et al., 2004, 2006). I propose a general class of test statistics that can be used to test for the effect of age on extinction. These test statistics allow for a varying background rate of extinction and attempt to remove the effects of other covariates when assessing the effect of age on extinction. No model is assumed for the covariate effects. Instead, I control for covariate effects by pairing or grouping together similar species. I use simulated data sets to compare the power of the statistics. In applying the test statistics to the Foram data, I have found age to have a positive effect on extinction.
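The Weibull modeling cited above (Parker and Arnold, 1997) admits a simple model-based check of the Red Queen hypothesis: if the extinction hazard does not depend on taxon age, durations are exponential, i.e. Weibull with shape parameter 1. The sketch below uses simulated durations, not the Foram data, and is a simpler diagnostic than the paired test statistics the dissertation proposes.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Under the Red Queen hypothesis the extinction hazard is age-independent,
# so taxon durations are exponential -- a Weibull with shape c = 1.
# A fitted shape c > 1 would indicate extinction risk rising with age.
durations = rng.exponential(scale=5.0, size=2000)  # simulated durations (Myr)

# Fit a Weibull with location fixed at zero and inspect the shape parameter.
shape, loc, scale = stats.weibull_min.fit(durations, floc=0)
print(f"fitted Weibull shape: {shape:.3f}")  # near 1 => age-independent hazard
```

A formal version would compare the Weibull and exponential log-likelihoods; here the point is only that the shape parameter is the age-effect dial in this model family.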
 Date Issued
 2010
 Identifier
 FSU_migr_etd0952
 Format
 Thesis
 Title
 Inference for Semiparametric Time-Varying Covariate Effect Relative Risk Regression Models.
 Creator

Ye, Gang, McKeague, Ian W., Wang, Xiaoming, Huffer, Fred W., Song, Kai-Sheng, Department of Statistics, Florida State University
 Abstract/Description

A major interest of survival analysis is to assess covariate effects on survival via appropriate conditional hazard function regression models. The Cox proportional hazards model, which assumes an exponential form for the relative risk, has been a popular choice. However, other regression forms such as Aalen's additive risk model may be more appropriate in some applications. In addition, covariate effects may depend on time, which cannot be reflected by a Cox proportional hazards model. In this dissertation, we study a class of time-varying covariate effect regression models in which the link function (relative risk function) is twice continuously differentiable and prespecified, but otherwise a general given function. This is a natural extension of the Prentice-Self model, in which the link function is general but covariate effects are modelled as time invariant. In the first part of the dissertation, we focus on estimating the cumulative or integrated covariate effects. The standard martingale approach based on counting processes is utilized to derive a likelihood-based iterating equation. An estimator for the cumulative covariate effect that is generated from the iterating equation is shown to be √n-consistent. Asymptotic normality of the estimator is also demonstrated. Another aspect of the dissertation is to investigate a new test for the above time-varying covariate effect regression model and to study consistency of the test based on martingale residuals. For Aalen's additive risk model, we introduce a test statistic based on the Huffer-McKeague weighted-least-squares estimator and show its consistency against some alternatives. An alternative way to construct a test statistic based on Bayesian bootstrap simulation is introduced. An application to real lifetime data is also presented.
 Date Issued
 2005
 Identifier
 FSU_migr_etd0949
 Format
 Thesis
 Title
 Transformation Models for Survival Data Analysis and Applications.
 Creator

Liu, Yang, Niu, Xufeng, Lloyd, Donald, McGee, Dan, Sinha, Debajyoti, Department of Statistics, Florida State University
 Abstract/Description

It is often assumed in standard survival models that all uncensored subjects will eventually experience the event of interest. However, in some situations when the event considered is not death, it will never occur for a proportion of subjects. Survival models with a cure fraction are becoming popular for analyzing this type of study. We propose a generalized transformation model motivated by Zeng et al.'s (2006) transformed proportional time cure model. In our proposed model, fractional polynomials are used instead of the simple linear combination of the covariates. The proposed models give us more flexibility without losing any good properties of the original model, such as asymptotic consistency and asymptotic normality of the regression coefficients. The proposed model will better fit data where the relationship between a response variable and covariates is nonlinear. We also provide a power selection procedure based on the likelihood function. A simulation study is carried out to show the accuracy of the proposed power selection procedure. The proposed models are applied to coronary heart disease and cancer-related medical data from both observational cohort studies and clinical trials.
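The cure-fraction idea in this abstract can be sketched with the simplest member of the family, a mixture cure model: S(t) = π + (1 − π)·exp(−λt), where a fraction π of subjects never experience the event. This is a standard illustrative formulation, not the dissertation's transformation model with fractional polynomials; all data and parameter values below are simulated.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Simulate a mixture cure model: fraction pi_true of subjects are "cured"
# (event never occurs); the rest have exponential event times. Everyone is
# administratively censored at tau.
n, pi_true, lam_true, tau = 3000, 0.3, 0.5, 10.0
cured = rng.random(n) < pi_true
t_event = np.where(cured, np.inf, rng.exponential(1 / lam_true, n))
time = np.minimum(t_event, tau)
event = (t_event <= tau).astype(float)  # 1 = observed event, 0 = censored

def neg_log_lik(params):
    pi, lam = params
    # observed events contribute the uncured density: (1-pi) * lam * exp(-lam t)
    # censored subjects contribute survival:          pi + (1-pi) * exp(-lam t)
    ll = np.sum(event * (np.log(1 - pi) + np.log(lam) - lam * time))
    ll += np.sum((1 - event) * np.log(pi + (1 - pi) * np.exp(-lam * time)))
    return -ll

res = minimize(neg_log_lik, x0=[0.5, 1.0],
               bounds=[(0.01, 0.99), (0.01, 10.0)])
pi_hat, lam_hat = res.x
print(pi_hat, lam_hat)
```

Covariates enter such models through the cure probability and/or the latency distribution; the abstract's contribution is to replace the usual linear predictor there with fractional polynomials.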
 Date Issued
 2009
 Identifier
 FSU_migr_etd1155
 Format
 Thesis