Current Search: Algorithms » Shireman, Emilie M
Search results
- Title
- Generalized Ensemble Sampling of Enzyme Reaction Free Energy Pathways.
- Creator
-
Wu, D, Fajer, M I, Cao, L, Cheng, X, Yang, W
- Abstract/Description
-
Free energy path sampling plays an essential role in computational understanding of chemical reactions, particularly those occurring in enzymatic environments. Among a variety of molecular dynamics simulation approaches, the generalized ensemble sampling strategy is uniquely attractive for the fact that it not only can enhance the sampling of rare chemical events but also can naturally ensure consistent exploration of environmental degrees of freedom. In this review, we plan to provide a tutorial-like tour on an emerging topic: generalized ensemble sampling of enzyme reaction free energy path. The discussion is largely focused on our own studies, particularly ones based on the metadynamics free energy sampling method and the on-the-path random walk path sampling method. We hope that this minipresentation will provide interested practitioners some meaningful guidance for future algorithm formulation and application study.
- Date Issued
- 2016-01-01
- Identifier
- FSU_pmch_27498634, 10.1016/bs.mie.2016.05.012, PMC4978182, 27498634, 27498634, S0076-6879(16)30047-7
- Format
- Citation
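The review summarized above centers on metadynamics-style free energy sampling. As a purely illustrative aside, the sketch below runs a toy one-dimensional metadynamics on a double-well potential, depositing Gaussian bias hills along a single collective variable. It is not the authors' on-the-path random walk method, and every parameter value (hill height, width, temperature, step count) is an arbitrary choice for the toy.

```python
import numpy as np

rng = np.random.default_rng(0)

def potential(x):
    return (x**2 - 1.0)**2                     # double well, minima at x = -1, +1

def bias(x, centers, height=0.1, width=0.2):
    # history-dependent bias: one Gaussian hill at every deposited center
    if not centers:
        return 0.0
    c = np.asarray(centers)
    return np.sum(height * np.exp(-(x - c)**2 / (2 * width**2)))

def force(x, centers, h=1e-4):
    # numerical derivative of the biased potential
    up = potential(x + h) + bias(x + h, centers)
    dn = potential(x - h) + bias(x - h, centers)
    return -(up - dn) / (2 * h)

x, dt, kT, gamma = -1.0, 1e-3, 0.1, 1.0
centers = []                                   # deposited hill centers
for step in range(50_000):
    # overdamped Langevin dynamics on the biased surface
    noise = np.sqrt(2 * kT * dt / gamma) * rng.standard_normal()
    x += force(x, centers) * dt / gamma + noise
    if step % 250 == 0:                        # deposit a hill periodically
        centers.append(x)

# Once the walker diffuses across the barrier, the negative of the accumulated
# bias roughly approximates the free energy profile along the variable.
grid = np.linspace(-2, 2, 201)
fes = -np.array([bias(g, centers) for g in grid])
barrier = fes[np.argmin(np.abs(grid - 0.0))] - fes[np.argmin(np.abs(grid + 1.0))]
print("rough barrier estimate between x = -1 well and x = 0 top:", round(barrier, 3))
```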
- Title
- A simulated annealing heuristic for maximum correlation core/periphery partitioning of binary networks.
- Creator
-
Brusco, Michael, Stolze, Hannah J, Hoffman, Michaela, Steinley, Douglas
- Abstract/Description
-
A popular objective criterion for partitioning a set of actors into core and periphery subsets is the maximization of the correlation between an ideal and observed structure associated with intra-core and intra-periphery ties. The resulting optimization problem has commonly been tackled using heuristic procedures such as relocation algorithms, genetic algorithms, and simulated annealing. In this paper, we present a computationally efficient simulated annealing algorithm for maximum correlation core/periphery partitioning of binary networks. The algorithm is evaluated using simulated networks consisting of up to 2000 actors and spanning a variety of densities for the intra-core, intra-periphery, and inter-core-periphery components of the network. Core/periphery analyses of problem solving, trust, and information sharing networks for the frontline employees and managers of a consumer packaged goods manufacturer are provided to illustrate the use of the model.
- Date Issued
- 2017-05-09
- Identifier
- FSU_pmch_28486475, 10.1371/journal.pone.0170448, PMC5423590, 28486475, 28486475, PONE-D-16-02050
- Format
- Citation
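The record above describes maximizing the correlation between an observed binary network and an ideal core/periphery structure via simulated annealing. Below is a minimal relocation-style annealer on a small synthetic network; it scores only intra-core and intra-periphery dyads, in the spirit of the criterion named in the abstract, but it is not the paper's algorithm, and the move set, cooling schedule, and network sizes are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def cp_correlation(A, labels):
    """Correlation between observed ties and the ideal core/periphery pattern,
    using only intra-core (ideal = 1) and intra-periphery (ideal = 0) dyads."""
    iu = np.triu_indices_from(A, k=1)
    same = (labels[:, None] == labels[None, :])[iu]
    obs = A[iu][same]
    ideal = (labels[:, None] & labels[None, :])[iu][same].astype(float)
    if obs.std() == 0 or ideal.std() == 0:
        return -1.0
    return np.corrcoef(obs, ideal)[0, 1]

def anneal(A, n_core, T0=1.0, cooling=0.999, iters=5000):
    n = A.shape[0]
    labels = np.zeros(n, dtype=int)
    labels[rng.choice(n, n_core, replace=False)] = 1       # 1 = core, 0 = periphery
    cur = best = cp_correlation(A, labels)
    best_labels, T = labels.copy(), T0
    for _ in range(iters):
        i = rng.integers(n)                                # relocation move: flip one actor
        labels[i] ^= 1
        new = cp_correlation(A, labels)
        if new >= cur or rng.random() < np.exp((new - cur) / T):
            cur = new
            if cur > best:
                best, best_labels = cur, labels.copy()
        else:
            labels[i] ^= 1                                 # reject: undo the move
        T *= cooling
    return best_labels, best

# Synthetic network: a dense 10-actor core and a sparse 30-actor periphery.
n_core, n = 10, 40
A = (rng.random((n, n)) < 0.05).astype(float)
A[:n_core, :n_core] = (rng.random((n_core, n_core)) < 0.8)
A = np.triu(A, 1); A = A + A.T                             # symmetric, no self-ties
labels, corr = anneal(A, n_core)
print("recovered core size:", int(labels.sum()), " correlation:", round(corr, 3))
```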
- Title
- An application of the elastic net for an endophenotype analysis.
- Creator
-
Palejev, Dean, Hwang, Wookyeon, Landi, Nicole, Eastman, Maria, Frost, Stephen J, Fulbright, Robert K, Kidd, Judith R, Kidd, Kenneth K, Mason, Graeme F, Mencl, W Einar, Yrigollen, Carolyn, Pugh, Kenneth R, Grigorenko, Elena L
- Abstract/Description
-
We provide an illustration of an application of the elastic net to a large number of common genetic variants in the context of the search for the genetic bases of an endophenotype conceivably related to individual differences in learning. GABA concentration in the occipital cortex, a critical area for reading, was obtained in a group (n = 76) of children aged 6-10 years. Two extreme groups, high and low, were selected for genotyping with the 650Y Illumina array chip (Ilmn650Y). An elastic net approach was applied to the resulting SNP dataset; 100 SNPs were identified for each chromosome as "interesting" based on having the highest absolute value coefficients. The analyses highlighted chromosomes 15 and 20, which contained 55 candidate genes. The STRING partner analyses of the associated proteins pointed to a number of related genes, most notably, GABA and NTRK receptors.
- Date Issued
- 2011-01-01
- Identifier
- FSU_pmch_21229297, 10.1007/s10519-011-9443-8, PMC3613288, 21229297, 21229297
- Format
- Citation
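As a rough illustration of the workflow in the record above (elastic net over many common variants, then ranking SNPs by absolute coefficient), the snippet below fits scikit-learn's ElasticNetCV to simulated 0/1/2 SNP dosages and a synthetic continuous phenotype. Only the sample size (n = 76) and the "top 100 by |coefficient|" selection rule are taken from the abstract; the number of SNPs, effect sizes, and penalty grid are invented, and this is not the study's pipeline.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(2)
n_subjects, n_snps = 76, 2000
X = rng.integers(0, 3, size=(n_subjects, n_snps)).astype(float)   # SNP dosages
causal = rng.choice(n_snps, 10, replace=False)                    # hidden signal
y = X[:, causal] @ rng.normal(1.0, 0.3, size=10) + rng.normal(0, 1, n_subjects)

# Elastic net with a cross-validated penalty path; l1_ratio mixes lasso and ridge.
model = ElasticNetCV(l1_ratio=0.5, n_alphas=20, cv=5, max_iter=5000)
model.fit(X, y)

# Rank SNPs by absolute coefficient and keep the top 100 as "interesting".
top = np.argsort(np.abs(model.coef_))[::-1][:100]
print("causal SNPs among the top 100 by |coef|:",
      len(set(top.tolist()) & set(causal.tolist())), "/ 10")
```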
- Title
- Examining the effect of initialization strategies on the performance of Gaussian mixture modeling.
- Creator
-
Shireman, Emilie, Steinley, Douglas, Brusco, Michael J
- Abstract/Description
-
Mixture modeling is a popular technique for identifying unobserved subpopulations (e.g., components) within a data set, with Gaussian (normal) mixture modeling being the form most widely used. Generally, the parameters of these Gaussian mixtures cannot be estimated in closed form, so estimates are typically obtained via an iterative process. The most common estimation procedure is maximum likelihood via the expectation-maximization (EM) algorithm. Like many approaches for identifying subpopulations, finite mixture modeling can suffer from locally optimal solutions, and the final parameter estimates are dependent on the initial starting values of the EM algorithm. Initial values have been shown to significantly impact the quality of the solution, and researchers have proposed several approaches for selecting the set of starting values. Five techniques for obtaining starting values that are implemented in popular software packages are compared. Their performances are assessed in terms of the following four measures: (1) the ability to find the best observed solution, (2) settling on a solution that classifies observations correctly, (3) the number of local solutions found by each technique, and (4) the speed at which the start values are obtained. On the basis of these results, a set of recommendations is provided to the user.
- Date Issued
- 2017-02-01
- Identifier
- FSU_pmch_26721666, 10.3758/s13428-015-0697-6, PMC4930421, 26721666, 26721666, 10.3758/s13428-015-0697-6
- Format
- Citation
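The record above examines how EM starting values affect Gaussian mixture estimates. The short sketch below contrasts two initialization strategies available in scikit-learn's GaussianMixture (k-means-based and random), each with repeated starts; these two options merely stand in for the five package-specific techniques the paper compares, and the simulated blobs are not the paper's design.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score

# Overlapping clusters make the likelihood surface multimodal, so starting
# values matter; n_init controls how many restarts EM gets per strategy.
X, truth = make_blobs(n_samples=600, centers=4, cluster_std=2.0, random_state=0)

for init in ["kmeans", "random"]:
    gm = GaussianMixture(n_components=4, init_params=init, n_init=10,
                         random_state=0)
    pred = gm.fit_predict(X)
    print(f"init={init:7s}  per-sample log-likelihood bound={gm.lower_bound_: .4f}"
          f"  ARI vs truth={adjusted_rand_score(truth, pred):.3f}")
```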
- Title
- Prediction of homoprotein and heteroprotein complexes by protein docking and template-based modeling: A CASP-CAPRI experiment.
- Creator
-
Lensink, Marc F, Velankar, Sameer, Kryshtafovych, Andriy, Huang, Shen-You, Schneidman-Duhovny, Dina, Sali, Andrej, Segura, Joan, Fernandez-Fuentes, Narcis, Viswanath, Shruthi, Elber, Ron, Grudinin, Sergei, Popov, Petr, Neveu, Emilie, Lee, Hasup, Baek, Minkyung, Park, Sangwoo, Heo, Lim, Rie Lee, Gyu, Seok, Chaok, Qin, Sanbo, Zhou, Huan-Xiang, Ritchie, David W, Maigret, Bernard, Devignes, Marie-Dominique, Ghoorah, Anisah, Torchala, Mieczyslaw, Chaleil, Raphaël A G, Bates, Paul A, Ben-Zeev, Efrat, Eisenstein, Miriam, Negi, Surendra S, Weng, Zhiping, Vreven, Thom, Pierce, Brian G, Borrman, Tyler M, Yu, Jinchao, Ochsenbein, Françoise, Guerois, Raphaël, Vangone, Anna, Rodrigues, João P G L M, van Zundert, Gydo, Nellen, Mehdi, Xue, Li, Karaca, Ezgi, Melquiond, Adrien S J, Visscher, Koen, Kastritis, Panagiotis L, Bonvin, Alexandre M J J, Xu, Xianjin, Qiu, Liming, Yan, Chengfei, Li, Jilong, Ma, Zhiwei, Cheng, Jianlin, Zou, Xiaoqin, Shen, Yang, Peterson, Lenna X, Kim, Hyung-Rae, Roy, Amit, Han, Xusi, Esquivel-Rodriguez, Juan, Kihara, Daisuke, Yu, Xiaofeng, Bruce, Neil J, Fuller, Jonathan C, Wade, Rebecca C, Anishchenko, Ivan, Kundrotas, Petras J, Vakser, Ilya A, Imai, Kenichiro, Yamada, Kazunori, Oda, Toshiyuki, Nakamura, Tsukasa, Tomii, Kentaro, Pallara, Chiara, Romero-Durana, Miguel, Jiménez-García, Brian, Moal, Iain H, Férnandez-Recio, Juan, Joung, Jong Young, Kim, Jong Yun, Joo, Keehyoung, Lee, Jooyoung, Kozakov, Dima, Vajda, Sandor, Mottarella, Scott, Hall, David R, Beglov, Dmitri, Mamonov, Artem, Xia, Bing, Bohnuud, Tanggis, Del Carpio, Carlos A, Ichiishi, Eichiro, Marze, Nicholas, Kuroda, Daisuke, Roy Burman, Shourya S, Gray, Jeffrey J, Chermak, Edrisse, Cavallo, Luigi, Oliva, Romina, Tovchigrechko, Andrey, Wodak, Shoshana J
- Abstract/Description
-
We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014. The targets included mostly homodimers, a few homotetramers, and two heterodimers, and comprised protein chains that could readily be modeled using templates from the Protein Data Bank. On average 24 CAPRI groups and 7 CASP groups submitted docking predictions for each target, and 12 CAPRI groups per target participated in the CAPRI scoring experiment. In total more than 9500 models were assessed against the 3D structures of the corresponding target complexes. Results show that the prediction of homodimer assemblies by homology modeling techniques and docking calculations is quite successful for targets featuring large enough subunit interfaces to represent stable associations. Targets with ambiguous or inaccurate oligomeric state assignments, often featuring crystal contact-sized interfaces, represented a confounding factor. For those, a much poorer prediction performance was achieved, while nonetheless often providing helpful clues on the correct oligomeric state of the protein. The prediction performance was very poor for genuine tetrameric targets, where the inaccuracy of the homology-built subunit models and the smaller pair-wise interfaces severely limited the ability to derive the correct assembly mode. Our analysis also shows that docking procedures tend to perform better than standard homology modeling techniques and that highly accurate models of the protein components are not always required to identify their association modes with acceptable accuracy. Proteins 2016; 84(Suppl 1):323-348. © 2016 Wiley Periodicals, Inc.
- Date Issued
- 2016-09-01
- Identifier
- FSU_pmch_27122118, 10.1002/prot.25007, PMC5030136, 27122118, 27122118
- Format
- Citation
- Title
- In vivo quantification of intraventricular hemorrhage in a neonatal piglet model using an EEG-layout based electrical impedance tomography array.
- Creator
-
Tang, Te, Weiss, Michael D, Borum, Peggy, Turovets, Sergei, Tucker, Don, Sadleir, Rosalind
- Abstract/Description
-
Intraventricular hemorrhage (IVH) is a common occurrence in the days immediately after premature birth. It has been correlated with outcomes such as periventricular leukomalacia (PVL), cerebral palsy and developmental delay. The causes and evolution of IVH are unclear; it has been associated with fluctuations in blood pressure, damage to the subventricular zone and seizures. At present, ultrasound is the most commonly used method for detection of IVH, but is used retrospectively. Without the presence of adequate therapies to avert IVH, the use of a continuous monitoring technique may be somewhat moot. While treatments to mitigate the damage caused by IVH are still under development, the principal benefit of a continuous monitoring technique will be in investigations into the etiology of IVH, and its associations with periventricular injury and blood pressure fluctuations. Electrical impedance tomography (EIT) is potentially of use in this context as accumulating blood displaces higher conductivity cerebrospinal fluid (CSF) in the ventricles. We devised an electrode array and EIT measurement strategy that performed well in detection of simulated ventricular blood in computer models and phantom studies. In this study we describe results of pilot in vivo experiments on neonatal piglets, and show that EIT has high sensitivity and specificity to small quantities of blood (<1 ml) introduced into the ventricle. EIT images were processed to an index representing the quantity of accumulated blood (the 'quantity index', QI). We found that QI values were linearly related to fluid quantity, and that the slope of the curve was consistent between measurements on different subjects. Linear discriminant analysis showed a false positive rate of 0%, and receiver operator characteristic analysis found area under curve values greater than 0.98 to administered volumes between 0.5, and 2.0 ml. We believe our study indicates that this method may be well suited to quantitative monitoring of IVH in newborns, simultaneously or interleaved with electroencephalograph assessments.
- Date Issued
- 2016-06-01
- Identifier
- FSU_pmch_27206102, 10.1088/0967-3334/37/6/751, PMC5333710, 27206102, 27206102
- Format
- Citation
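Purely to illustrate the post-hoc analysis described in the record above (a "quantity index" that scales roughly linearly with introduced blood volume, plus a ROC analysis for detection), the snippet below fits a line and computes an AUC on invented numbers. It has no connection to the actual EIT reconstructions, electrode layout, or piglet data.

```python
import numpy as np
from scipy import stats
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
# Hypothetical measurements: QI values for repeated runs at each volume.
volumes_ml = np.repeat([0.0, 0.5, 1.0, 1.5, 2.0], 8)
qi = 0.9 * volumes_ml + rng.normal(0, 0.15, volumes_ml.size)

# Linear relationship between QI and administered volume.
slope, intercept, r, p, se = stats.linregress(volumes_ml, qi)
print(f"QI vs volume: slope = {slope:.2f}, r^2 = {r**2:.3f}")

# ROC analysis: can QI separate "blood present" from baseline measurements?
present = (volumes_ml > 0).astype(int)
print(f"ROC AUC for detecting any blood: {roc_auc_score(present, qi):.3f}")
```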
- Title
- A Landmark-Free Method for Three-Dimensional Shape Analysis.
- Creator
-
Pomidor, Benjamin J, Makedonska, Jana, Slice, Dennis E
- Abstract/Description
-
The tools and techniques used in morphometrics have always aimed to transform the physical shape of an object into a concise set of numerical data for mathematical analysis. The advent of landmark-based morphometrics opened new avenues of research, but these methods are not without drawbacks. The time investment required of trained individuals to accurately landmark a data set is significant, and the reliance on readily-identifiable physical features can hamper research efforts. This is especially true of those investigating smooth or featureless surfaces. In this paper, we present a new method to perform this transformation for data obtained from high-resolution scanning technology. This method uses surface scans, instead of landmarks, to calculate a shape difference metric analogous to Procrustes distance and perform superimposition. This is accomplished by building upon and extending the Iterative Closest Point algorithm. We also explore some new ways this data can be used; for example, we can calculate an averaged surface directly and visualize point-wise shape information over this surface. Finally, we briefly demonstrate this method on a set of primate skulls and compare the results of the new methodology with traditional geometric morphometric analysis.
- Date Issued
- 2016-03-08
- Identifier
- FSU_pmch_26953573, 10.1371/journal.pone.0150368, PMC4783062, 26953573, 26953573, PONE-D-15-18418
- Format
- Citation
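The method in the record above builds on and extends the Iterative Closest Point (ICP) algorithm. Below is a bare-bones ICP sketch: nearest-neighbor correspondences from a k-d tree plus a Kabsch (SVD) rigid alignment, with the final mean closest-point distance reported as a crude analogue of a shape-difference metric. The paper's extensions (averaged surfaces, point-wise shape visualization) are not reproduced, and the demo point cloud is synthetic.

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(P, Q):
    """Least-squares rotation/translation mapping points P onto Q (Kabsch)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:              # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cq - R @ cp

def icp(source, target, n_iter=30):
    """Basic ICP: returns the aligned source cloud and the mean closest-point
    distance, a crude analogue of a shape-difference metric."""
    tree = cKDTree(target)
    src = source.copy()
    for _ in range(n_iter):
        dist, idx = tree.query(src)       # nearest-neighbor correspondences
        R, t = best_rigid_transform(src, target[idx])
        src = src @ R.T + t
    dist, _ = tree.query(src)
    return src, dist.mean()

# Demo: a noisy, rotated, translated copy of a random point cloud.
rng = np.random.default_rng(4)
target = rng.normal(size=(500, 3))
theta = np.deg2rad(25)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
source = target @ Rz.T + np.array([0.5, -0.2, 0.1]) + rng.normal(0, 0.01, (500, 3))
aligned, d = icp(source, target)
print("mean closest-point distance after ICP:", round(d, 4))
```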
- Title
- Self-organization in precipitation reactions far from the equilibrium.
- Creator
-
Nakouzi, Elias, Steinbock, Oliver
- Abstract/Description
-
Far from the thermodynamic equilibrium, many precipitation reactions create complex product structures with fascinating features caused by their unusual origins. Unlike the dissipative patterns in other self-organizing reactions, these features can be permanent, suggesting potential applications in materials science and engineering. We review four distinct classes of precipitation reactions, describe similarities and differences, and discuss related challenges for theoretical studies. These classes are hollow micro- and macrotubes in chemical gardens, polycrystalline silica carbonate aggregates (biomorphs), Liesegang bands, and propagating precipitation-dissolution fronts. In many cases, these systems show intricate structural hierarchies that span from the nanometer scale into the macroscopic world. We summarize recent experimental progress that often involves growth under tightly regulated conditions by means of wet stamping, holographic heating, and controlled electric, magnetic, or pH perturbations. In this research field, progress requires mechanistic insights that cannot be derived from experiments alone. We discuss how mesoscopic aspects of the product structures can be modeled by reaction-transport equations and suggest important targets for future studies that should also include materials features at the nanoscale.
- Date Issued
- 2016-08-19
- Identifier
- FSU_pmch_27551688, 10.1126/sciadv.1601144, PMC4991932, 27551688, 27551688, 1601144
- Format
- Citation
- Title
- A Comparison of Rule-based Analysis with Regression Methods in Understanding the Risk Factors for Study Withdrawal in a Pediatric Study.
- Creator
-
Haghighi, Mona, Johnson, Suzanne Bennett, Qian, Xiaoning, Lynch, Kristian F, Vehik, Kendra, Huang, Shuai
- Abstract/Description
-
Regression models are extensively used in many epidemiological studies to understand the linkage between specific outcomes of interest and their risk factors. However, regression models in general examine the average effects of the risk factors and ignore subgroups with different risk profiles. As a result, interventions are often geared towards the average member of the population, without consideration of the special health needs of different subgroups within the population. This paper demonstrates the value of using rule-based analysis methods that can identify subgroups with heterogeneous risk profiles in a population without imposing assumptions on the subgroups or method. The rules define the risk pattern of subsets of individuals by not only considering the interactions between the risk factors but also their ranges. We compared the rule-based analysis results with the results from a logistic regression model in The Environmental Determinants of Diabetes in the Young (TEDDY) study. Both methods detected a similar suite of risk factors, but the rule-based analysis was superior at detecting multiple interactions between the risk factors that characterize the subgroups. A further investigation of the particular characteristics of each subgroup may detect the special health needs of the subgroup and lead to tailored interventions.
- Date Issued
- 2016-08-26
- Identifier
- FSU_pmch_27561809, 10.1038/srep30828, PMC5000469, 27561809, 27561809, srep30828
- Format
- Citation
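To make the contrast in the record above concrete, the sketch below fits a logistic regression (average effects) and a shallow decision tree (one simple way to extract subgroup rules; not the rule-mining method actually used in the TEDDY analysis) to synthetic withdrawal data containing an interaction. The risk factor names and effect sizes are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(5)
n = 2000
age = rng.uniform(0, 10, n)                  # hypothetical risk factors
distance = rng.uniform(0, 100, n)
first_child = rng.integers(0, 2, n)
# Withdrawal risk is high only for one subgroup (an interaction of two factors).
p = 0.05 + 0.5 * ((age < 2) & (distance > 60)) + 0.1 * first_child
withdrew = rng.random(n) < p
X = np.column_stack([age, distance, first_child])
names = ["age", "distance", "first_child"]

logit = LogisticRegression().fit(X, withdrew)
print("logistic regression coefficients (average effects):")
for name, coef in zip(names, logit.coef_[0]):
    print(f"  {name}: {coef: .3f}")

tree = DecisionTreeClassifier(max_depth=2, min_samples_leaf=100).fit(X, withdrew)
print("\nextracted rules (subgroups defined by interactions and ranges):")
print(export_text(tree, feature_names=names))
```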
- Title
- Preliminary Analysis of Difficulty of Importing Pattern-Based Concepts into the National Cancer Institute Thesaurus.
- Creator
-
He, Zhe, Geller, James
- Abstract/Description
-
Maintenance of biomedical ontologies is difficult. We have developed a pattern-based method for dealing with the problem of identifying missing concepts in the National Cancer Institute thesaurus (NCIt). Specifically, we are mining patterns connecting NCIt concepts with concepts in other ontologies to identify candidate missing concepts. However, the final decision about a concept insertion is always up to a human ontology curator. In this paper, we are estimating the difficulty of this task for a domain expert by counting possible choices for a pattern-based insertion. We conclude that even with support of our mining algorithm, the insertion task is challenging.
- Date Issued
- 2016-01-01
- Identifier
- FSU_pmch_27577410, PMC5785234, 27577410, 27577410
- Format
- Citation
- Title
- A comparison of latent class, K-means, and K-median methods for clustering dichotomous data.
- Creator
-
Brusco, Michael J, Shireman, Emilie, Steinley, Douglas
- Abstract/Description
-
The problem of partitioning a collection of objects based on their measurements on a set of dichotomous variables is a well-established problem in psychological research, with applications including clinical diagnosis, educational testing, cognitive categorization, and choice analysis. Latent class analysis and K-means clustering are popular methods for partitioning objects based on dichotomous measures in the psychological literature. The K-median clustering method has recently been touted as a potentially useful tool for psychological data and might be preferable to its close neighbor, K-means, when the variable measures are dichotomous. We conducted simulation-based comparisons of the latent class, K-means, and K-median approaches for partitioning dichotomous data. Although all 3 methods proved capable of recovering cluster structure, K-median clustering yielded the best average performance, followed closely by latent class analysis. We also report results for the 3 methods within the context of an application to transitive reasoning data, in which it was found that the 3 approaches can exhibit profound differences when applied to real data.
- Date Issued
- 2017-09-01
- Identifier
- FSU_pmch_27607543, 10.1037/met0000095, PMC5982597, 27607543, 27607543, 2016-43141-001
- Format
- Citation
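A minimal version of the comparison in the record above, on simulated dichotomous data: scikit-learn's KMeans against a bare-bones Lloyd-style K-medians (L1 assignment, coordinate-wise median update), scored with the adjusted Rand index. The latent class branch of the comparison and the paper's simulation design are not reproduced, and the kmedians routine here is a stand-in, not the implementation the authors evaluated.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def kmedians(X, k, n_iter=50, seed=0):
    """Lloyd-style K-medians: L1 assignment, coordinate-wise median update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].astype(float)
    for _ in range(n_iter):
        d = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([np.median(X[labels == j], axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

# Simulated dichotomous data: 3 classes with different item-endorsement profiles.
rng = np.random.default_rng(6)
k, n_per, n_items = 3, 100, 20
profiles = rng.uniform(0.1, 0.9, size=(k, n_items))
truth = np.repeat(np.arange(k), n_per)
X = (rng.random((k * n_per, n_items)) < profiles[truth]).astype(float)

km = KMeans(n_clusters=k, n_init=20, random_state=0).fit_predict(X)
kmed = kmedians(X, k, seed=0)
print("ARI  K-means :", round(adjusted_rand_score(truth, km), 3))
print("ARI  K-median:", round(adjusted_rand_score(truth, kmed), 3))
```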
- Title
- GINOM: A statistical framework for assessing interval overlap of multiple genomic features.
- Creator
-
Bryner, Darshan, Criscione, Stephen, Leith, Andrew, Huynh, Quyen, Huffer, Fred, Neretti, Nicola
- Abstract/Description
-
A common problem in genomics is to test for associations between two or more genomic features, typically represented as intervals interspersed across the genome. Existing methodologies can test for significant pairwise associations between two genomic intervals; however, they cannot test for associations involving multiple sets of intervals. This limits our ability to uncover more complex, yet biologically important associations between multiple sets of genomic features. We introduce GINOM (Genomic INterval Overlap Model), a new method that enables testing of significant associations between multiple genomic features. We demonstrate GINOM's ability to identify higher-order associations with both simulated and real data. In particular, we used GINOM to explore L1 retrotransposable element insertion bias in lung cancer and found a significant pairwise association between L1 insertions and heterochromatic marks. Unlike other methods, GINOM also detected an association between L1 insertions and gene bodies marked by a facultative heterochromatic mark, which could explain the observed bias for L1 insertions towards cancer-associated genes.
- Date Issued
- 2017-06-15
- Identifier
- FSU_pmch_28617797, 10.1371/journal.pcbi.1005586, PMC5491313, 28617797, 28617797, PCOMPBIOL-D-16-01322
- Format
- Citation
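GINOM models overlap among multiple interval sets jointly; the sketch below only illustrates the simpler pairwise question with a naive Monte Carlo randomization of one interval set along a single chromosome. The interval sizes, counts, and randomization scheme are invented and do not reflect GINOM's actual statistical framework.

```python
import numpy as np

rng = np.random.default_rng(7)
genome_len = 1_000_000

def random_intervals(n, length, rng):
    starts = rng.integers(0, genome_len - length, size=n)
    return np.column_stack([starts, starts + length])

def total_overlap(a, b):
    """Total number of bases shared between interval sets a and b."""
    bp = 0
    for s1, e1 in a:
        lo = np.maximum(s1, b[:, 0])
        hi = np.minimum(e1, b[:, 1])
        bp += np.clip(hi - lo, 0, None).sum()
    return bp

# "Observed" data: half of feature B is deliberately placed near feature A.
A = random_intervals(200, 1000, rng)
B = np.column_stack([A[:100, 0] + 200, A[:100, 0] + 1200])
B = np.vstack([B, random_intervals(100, 1000, rng)])

observed = total_overlap(A, B)
null = np.array([total_overlap(A, random_intervals(len(B), 1000, rng))
                 for _ in range(500)])
p_value = (np.sum(null >= observed) + 1) / (null.size + 1)
print(f"observed overlap = {observed} bp, Monte Carlo p = {p_value:.4f}")
```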
- Title
- Rate Constants and Mechanisms of Protein-Ligand Binding.
- Creator
-
Pang, Xiaodong, Zhou, Huan-Xiang
- Abstract/Description
-
Whereas protein-ligand binding affinities have long-established prominence, binding rate constants and binding mechanisms have gained increasing attention in recent years. Both new computational methods and new experimental techniques have been developed to characterize the latter properties. It is now realized that binding mechanisms, like binding rate constants, can and should be quantitatively determined. In this review, we summarize studies and synthesize ideas on several topics in the hope of providing a coherent picture of and physical insight into binding kinetics. The topics include microscopic formulation of the kinetic problem and its reduction to simple rate equations; computation of binding rate constants; quantitative determination of binding mechanisms; and elucidation of physical factors that control binding rate constants and mechanisms.
- Date Issued
- 2017-05-22
- Identifier
- FSU_pmch_28375732, 10.1146/annurev-biophys-070816-033639, PMC5592114, 28375732, 28375732
- Format
- Citation
- Title
- A confidence building exercise in data and identifiability: Modeling cancer chemotherapy as a case study.
- Creator
-
Eisenberg, Marisa C, Jain, Harsh V
- Abstract/Description
-
Mathematical modeling has a long history in the field of cancer therapeutics, and there is increasing recognition that it can help uncover the mechanisms that underlie tumor response to treatment. However, making quantitative predictions with such models often requires parameter estimation from data, raising questions of parameter identifiability and estimability. Even in the case of structural (theoretical) identifiability, imperfect data and the resulting practical unidentifiability of model parameters can make it difficult to infer the desired information, and in some cases, to yield biologically correct inferences and predictions. Here, we examine parameter identifiability and estimability using a case study of two compartmental, ordinary differential equation models of cancer treatment with drugs that are cell cycle-specific (taxol) as well as non-specific (oxaliplatin). We proceed through model building, structural identifiability analysis, parameter estimation, practical identifiability analysis and its biological implications, as well as alternative data collection protocols and experimental designs that render the model identifiable. We use the differential algebra/input-output relationship approach for structural identifiability, and primarily the profile likelihood approach for practical identifiability. Despite the models being structurally identifiable, we show that without consideration of practical identifiability, incorrect cell cycle distributions can be inferred, that would result in suboptimal therapeutic choices. We illustrate the usefulness of estimating practically identifiable combinations (in addition to the more typically considered structurally identifiable combinations) in generating biologically meaningful insights. We also use simulated data to evaluate how the practical identifiability of the model would change under alternative experimental designs. These results highlight the importance of understanding the underlying mechanisms rather than purely using parsimony or information criteria/goodness-of-fit to decide model selection questions. The overall roadmap for identifiability testing laid out here can be used to help provide mechanistic insight into complex biological phenomena, reduce experimental costs, and optimize model-driven experimentation.
- Date Issued
- 2017-10-27
- Identifier
- FSU_pmch_28733187, 10.1016/j.jtbi.2017.07.018, PMC6007023, 28733187, 28733187, S0022-5193(17)30345-4
- Format
- Citation
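As a small illustration of the profile likelihood step mentioned in the record above, the snippet below profiles one parameter of a deliberately simple model (exponential decay) by fixing it on a grid and re-optimizing the nuisance parameter at each grid point; a flat profile would signal practical unidentifiability. The paper's compartmental chemotherapy models, structural identifiability analysis, and data are not reproduced.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(10)
# Toy model y = A * exp(-k * t) with additive Gaussian noise.
t = np.linspace(0, 5, 20)
A_true, k_true, sigma = 10.0, 0.8, 0.3
y = A_true * np.exp(-k_true * t) + rng.normal(0, sigma, t.size)

def neg_log_lik(A, k):
    resid = y - A * np.exp(-k * t)
    return 0.5 * np.sum(resid**2) / sigma**2

# Profile over k: fix k on a grid, re-optimize the nuisance parameter A.
k_grid = np.linspace(0.4, 1.2, 41)
profile = []
for k in k_grid:
    res = minimize_scalar(lambda A: neg_log_lik(A, k), bounds=(0, 100),
                          method="bounded")
    profile.append(res.fun)
profile = np.array(profile)

# Approximate 95% interval: k values within chi2(1)/2 = 1.92 of the minimum.
best = profile.min()
inside = k_grid[profile - best < 1.92]
print(f"profile-likelihood 95% CI for k: [{inside.min():.2f}, {inside.max():.2f}]"
      f"  (true k = {k_true})")
```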
- Title
- Construction and Optimization of a Large Gene Coexpression Network in Maize Using RNA-Seq Data.
- Creator
-
Huang, Ji, Vendramin, Stefania, Shi, Lizhen, McGinnis, Karen M
- Abstract/Description
-
With the emergence of massively parallel sequencing, genomewide expression data production has reached an unprecedented level. This abundance of data has greatly facilitated maize research, but may not be amenable to traditional analysis techniques that were optimized for other data types. Using publicly available data, a gene coexpression network (GCN) can be constructed and used for gene function prediction, candidate gene selection, and improving understanding of regulatory pathways. Several GCN studies have been done in maize, mostly using microarray datasets. To build an optimal GCN from plant materials RNA-Seq data, parameters for expression data normalization and network inference were evaluated. A comprehensive evaluation of these two parameters and a ranked aggregation strategy on network performance, using libraries from 1266 maize samples, were conducted. Three normalization methods and 10 inference methods, including six correlation and four mutual information methods, were tested. The three normalization methods had very similar performance. For network inference, correlation methods performed better than mutual information methods at some genes. Increasing sample size also had a positive effect on GCN. Aggregating single networks together resulted in improved performance compared to single networks.
- Date Issued
- 2017-09-01
- Identifier
- FSU_pmch_28768814, 10.1104/pp.17.00825, PMC5580776, 28768814, 28768814, pp.17.00825
- Format
- Citation
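Echoing the inference and aggregation steps in the record above, the sketch below builds Pearson and Spearman co-expression matrices from a small simulated expression table and combines them with a simple rank aggregation. The normalization choice, the mutual-information methods, and the evaluation against curated maize annotations are not reproduced; every size and distribution here is a toy value.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(8)
n_genes, n_samples = 50, 200
expr = rng.lognormal(mean=2.0, sigma=0.5, size=(n_genes, n_samples))
expr[1] = expr[0] * rng.lognormal(0, 0.1, n_samples)   # one co-expressed pair

log_expr = np.log2(expr + 1)                 # simple normalization stand-in
pearson = np.corrcoef(log_expr)              # genes are rows
rho, _ = spearmanr(log_expr, axis=1)          # rank-based counterpart

def rank_matrix(M):
    """Rank all gene pairs by |correlation| (larger correlation -> larger rank)."""
    flat = np.abs(M).ravel()
    return flat.argsort().argsort().reshape(M.shape)

# Simple rank aggregation of the two single networks.
aggregated = (rank_matrix(pearson) + rank_matrix(rho)) / 2.0
np.fill_diagonal(aggregated, -np.inf)        # ignore self-edges
i, j = np.unravel_index(np.argmax(aggregated), aggregated.shape)
print(f"top aggregated co-expression edge: gene {i} -- gene {j}")
```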
- Title
- Super-delta: a new differential gene expression analysis procedure with robust data normalization.
- Creator
-
Liu, Yuhang, Zhang, Jinfeng, Qiu, Xing
- Abstract/Description
-
Normalization is an important data preparation step in gene expression analyses, designed to remove various systematic noise. Sample variance is greatly reduced after normalization, hence the power of subsequent statistical analyses is likely to increase. On the other hand, variance reduction is made possible by borrowing information across all genes, including differentially expressed genes (DEGs) and outliers, which will inevitably introduce some bias. This bias typically inflates type I error; and can reduce statistical power in certain situations. In this study we propose a new differential expression analysis pipeline, dubbed as super-delta, that consists of a multivariate extension of the global normalization and a modified t-test. A robust procedure is designed to minimize the bias introduced by DEGs in the normalization step. The modified t-test is derived based on asymptotic theory for hypothesis testing that suitably pairs with the proposed robust normalization. We first compared super-delta with four commonly used normalization methods: global, median-IQR, quantile, and cyclic loess normalization in simulation studies. Super-delta was shown to have better statistical power with tighter control of type I error rate than its competitors. In many cases, the performance of super-delta is close to that of an oracle test in which datasets without technical noise were used. We then applied all methods to a collection of gene expression datasets on breast cancer patients who received neoadjuvant chemotherapy. While there is a substantial overlap of the DEGs identified by all of them, super-delta were able to identify comparatively more DEGs than its competitors. Downstream gene set enrichment analysis confirmed that all these methods selected largely consistent pathways. Detailed investigations on the relatively small differences showed that pathways identified by super-delta have better connections to breast cancer than other methods. As a new pipeline, super-delta provides new insights to the area of differential gene expression analysis. Solid theoretical foundation supports its asymptotic unbiasedness and technical noise-free properties. Implementation on real and simulated datasets demonstrates its decent performance compared with state-of-art procedures. It also has the potential of expansion to be incorporated with other data type and/or more general between-group comparison problems.
- Date Issued
- 2017-12-21
- Identifier
- FSU_pmch_29268715, 10.1186/s12859-017-1992-2, PMC5740711, 29268715, 29268715, 10.1186/s12859-017-1992-2
- Format
- Citation
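The sketch below illustrates, in heavily simplified form, the idea described in the record above: differences between a gene and a set of partner genes cancel sample-level technical effects, and a trimmed summary across partners limits the influence of differentially expressed partners. It is not the paper's estimator or its modified t-test; the partner count, trimming fraction, and simulated data are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
n_genes, n_per_group = 300, 20
array_effect = rng.normal(0, 2.0, size=2 * n_per_group)      # technical noise per sample
expr = rng.normal(8.0, 1.0, size=(n_genes, 2 * n_per_group)) + array_effect
group = np.array([0] * n_per_group + [1] * n_per_group)
expr[:10, group == 1] += 2.0                                  # 10 true DEGs

def delta_based_stat(g, n_partners=50):
    """Gene-level statistic from pairwise differences against partner genes."""
    partners = rng.choice(np.delete(np.arange(n_genes), g), n_partners,
                          replace=False)
    t_stats = []
    for h in partners:
        delta = expr[g] - expr[h]                  # sample-level effect cancels here
        t, _ = stats.ttest_ind(delta[group == 0], delta[group == 1])
        t_stats.append(t)
    return stats.trim_mean(t_stats, 0.2)           # robust to DEG partners

scores = np.array([abs(delta_based_stat(g)) for g in range(n_genes)])
top10 = np.argsort(scores)[::-1][:10]
print("true DEGs recovered in the top 10:", int(np.sum(top10 < 10)), "/ 10")
```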
- Title
- Perceiving the Usefulness of the National Cancer Institute Metathesaurus for Enriching NCIt with Topological Patterns.
- Creator
-
He, Zhe, Chen, Yan, Geller, James
- Abstract/Description
-
The National Cancer Institute Thesaurus (NCIt), developed and maintained by the National Cancer Institute, is an important reference terminology in the cancer domain. As a controlled terminology needs to continuously incorporate new concepts to enrich its conceptual content, automated and semi-automated methods for identifying potential new concepts are in high demand. We have previously developed a topological-pattern-based method for identifying new concepts in a controlled terminology to enrich another terminology, using the UMLS Metathesaurus. In this work, we utilize this method with the National Cancer Institute Metathesaurus to identify new concepts for NCIt. While previous work was only oriented towards identifying candidate import concepts for human review, we are now also adding an algorithmic method to evaluate candidate concepts and reject a well defined group of them.
- Date Issued
- 2017-01-01
- Identifier
- FSU_pmch_29295222, PMC5785238, 29295222, 29295222
- Format
- Citation