Bayesian Modeling for High Throughput Genomic Data.

Hu, Ming

dc.contributor	Qin, Zhaohui
dc.contributor	Abecasis, Goncalo
dc.contributor	Johnson, Timothy D.
dc.contributor	Kumar, Chandan
dc.contributor	Lin, Jiandie
dc.contributor	Taylor, Jeremy M.
dc.creator	Hu, Ming
dc.date	2011-01-18T16:20:50Z
dc.date	NO_RESTRICTION
dc.date	2010
dc.date.accessioned	2022-05-19T13:29:54Z
dc.date.available	2022-05-19T13:29:54Z
dc.identifier	http://hdl.handle.net/2027.42/78939
dc.identifier.uri	http://localhost:8080/xmlui/handle/CUHPOERS/117302
dc.description	The explosion of high throughput genomic data in recent years has already altered our view of the extent and complexity of biology. Technology-specific features, heterogeneous data structures and massive sample sizes present great challenges and opportunities to develop novel statistical methodologies in computational biology. This dissertation presents three Bayesian modeling methods in high throughput genomic data analysis. In chapter 2, we develop a model-based gene expression query algorithm built under the Bayesian model selection framework. This algorithm is capable of detecting co-expression profiles under a subset of samples/experimental conditions. In addition, it allows linearly transformed expression patterns to be recognized and is robust in the presence of sporadic outliers in the data. Our simulation studies suggest that this method outperforms existing query tools. When we apply this new method to the Escherichia coli microarray compendium data, it identifies a majority of known regulons, as well as novel potential target genes of numerous key transcription factors. In chapter 3, we introduce a novel computational algorithm named Hybrid Motif Sampler (HMS), specifically designed for transcription factor binding site (TFBS) motif discovery in ChIP-Seq data. HMS incorporates sequencing depth information to aid motif identification, allows intra-motif dependency in order to describe the underlying motif pattern more accurately, and combines stochastic sampling and deterministic search to accelerate the computation process. Simulation studies demonstrate favorable performance of HMS compared to other existing methods. When applying HMS to real ChIP-Seq datasets, we find that the accuracy of existing TFBS motif patterns can be significantly improved. In chapter 4, we propose a spatial Poisson regression model to provide a portrait of base-level sequencing depth in RNA-Seq data. The model utilizes two random effects to explain the spatial correlation and the non-spatial variation and incorporates GC content effects into the mean structure for better fitting. Both the simulation study and the real data analysis demonstrate that this method captures local genomic features that affect coverage depth and therefore offers improved quantification of the true underlying expression levels. The research in this dissertation demonstrates that Bayesian modeling methods have achieved great success and have the potential to accelerate biomedical research.
dc.description	Ph.D.
dc.description	Biostatistics
dc.description	University of Michigan, Horace H. Rackham School of Graduate Studies
dc.description	http://deepblue.lib.umich.edu/bitstream/2027.42/78939/1/hming_1.pdf
dc.format	6538053 bytes
dc.format	1373 bytes
dc.format	application/pdf
dc.format	text/plain
dc.format	application/pdf
dc.language	en_US
dc.subject	Bayesian Modeling
dc.subject	High Throughput Genomic Data
dc.subject	MCMC
dc.subject	ChIP-Seq
dc.subject	RNA-Seq
dc.subject	Microarray
dc.subject	Genetics
dc.subject	Public Health
dc.subject	Statistics and Numeric Data
dc.subject	Health Sciences
dc.subject	Science
dc.title	Bayesian Modeling for High Throughput Genomic Data.
dc.type	Thesis

Files in this item

Files	Size	Format
hming_1.pdf	6.538 MB	application/pdf

This item appears in the following Collection(s)

Deep Blue Repositories - University of Michigan [17189]
