Comparative mRNA Expression Analysis Leveraging Known Biochemical Interactions

Title: Comparative mRNA Expression Analysis Leveraging Known Biochemical Interactions.
Name(s): Steppi, Albert Joseph, III, author
Zhang, Jinfeng, professor directing dissertation
Sang, Qing-Xiang, university representative
Wu, Wei, committee member
Niu, Xufeng, 1954-, committee member
Florida State University, degree granting institution
College of Arts and Sciences, degree granting college
Department of Statistics, degree granting department
Type of Resource: text
Genre: Text
Doctoral Thesis
Issuance: monographic
Date Issued: 2018
Publisher: Florida State University
Place of Publication: Tallahassee, Florida
Physical Form: computer
online resource
Extent: 1 online resource (78 pages)
Language(s): English
Abstract/Description: We present two studies incorporating existing biological knowledge into differential gene expression analysis that attempt to place the results within a broader biological context. The studies investigate breast cancer health disparity between differing ethnic groups by comparing gene expression levels in tumor samples from patients from different ethnic populations. We incorporate existing knowledge by making comparisons not just between individual genes, but between sets of related genes and networks of interacting genes. In the first study, a comparison is made between mRNA expression patterns in Asian and Caucasian American breast cancer samples in an attempt to better understand why there are significantly lower breast cancer incidence and mortality rates in Asian Americans compared to Caucasian Americans. In the second study, the expression levels of genes related to drug and xenobiotic metabolizing enzymes (DXME) are compared between African, Asian, and Caucasian American breast cancer patients. The expression of genes related to these enzymes has been found to significantly affect drug clearance and the onset of drug resistance. Both studies found differentially expressed genes and pathways that may be associated with health disparities between the three ethnic populations. A thorough investigation of the literature was made in order to understand the context in which these differences in gene expression could affect the development and progression of breast tumors, and to identify genes and pathways that may be differentially expressed between the ethnic groups in general but not associated with breast cancer. Many of the relevant differences in gene expression were found to be linked to factors such as diet and differences in body composition.
The process of finding relevant pathways and sets of interacting genes to inform comparative mRNA expression analysis can be laborious and time-consuming. The literature is expanding at an exponential rate, and there is little hope for research groups to be able to keep up with all of the latest research. It is becoming more common for journals to require authors to make their results available in public databases, but many results concerning biochemical interactions are only accessible in unstructured text. Extracting relationships and interactions from the biological literature using techniques from machine learning and natural language processing is an important and growing field of research. To gain a better understanding of this field, we participated in the BioCreative VI Track 4 challenge, which involved classifying PubMed abstracts that contain examples of protein-protein interactions that are affected by a mutation. We discuss the model we developed and the lessons learned while participating in the competition. The problem of acquiring sufficient quantities of quality labeled data is a great obstacle preventing the improvement of performance. We present a web application we are developing to streamline the annotation of entity-entity interactions in text. It makes use of a database of known interactions to locate passages that are likely to be relevant and offers a simple and concise user interface to minimize the cognitive burden on the annotator.
Identifier: 2018_Sp_Steppi_fsu_0071E_14522 (IID)
Submitted Note: A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy.
Degree Awarded: Spring Semester 2018.
Date of Defense: April 10, 2018.
Keywords: Cancer Health Disparity, Gene Expression, Protein-Protein Interactions, Text Annotation, Text Mining
Bibliography Note: Includes bibliographical references.
Advisory Committee: Jinfeng Zhang, Professor Directing Dissertation; Qing-Xiang Sang, University Representative; Wei Wu, Committee Member; Xufeng Niu, Committee Member.
Subject(s): Bioinformatics
Statistics
Computer science
Persistent Link to This Record: http://purl.flvc.org/fsu/fd/2018_Sp_Steppi_fsu_0071E_14522
Host Institution: FSU

Steppi, A. J. (2018). Comparative mRNA Expression Analysis Leveraging Known Biochemical Interactions. Retrieved from http://purl.flvc.org/fsu/fd/2018_Sp_Steppi_fsu_0071E_14522