A nonparametric empirical Bayes approach to joint modeling of multiple sources of genomic data. (English) Zbl 1135.62087

Summary: With the rapid accumulation of various high-throughput genomic and proteomic data, one is compelled to develop new statistical methods that can take advantage of multiple existing sources of data. In our motivating example, a chromatin-immunoprecipitation (ChIP) microarray experiment was conducted to detect binding target genes of a broad transcription regulator, leucine responsive regulatory protein (Lrp), in E. coli. In addition, a cDNA microarray dataset is available to compare gene expression in the wild type with that in a mutant with the Lrp gene deleted in E. coli. It is biologically reasonable to assume that the genes with altered expression are more likely to be regulated by Lrp than those with no expression change. Hence we aim to borrow information from the gene expression data to increase statistical power to detect the binding targets of Lrp.
We propose a novel joint model for protein-DNA binding data and gene expression data; under mild modeling assumptions, it is shown that the method is optimal, and equivalent to a joint likelihood ratio test. We compare the joint modeling with two existing methods of combining separate analyses. We adopt a nonparametric empirical Bayes (EB) method to draw statistical inference in the joint model; in particular, we propose a new method, maximum likelihood conditional on the binding data, to estimate two prior probabilities for the expression data, which are non-identifiable based on the expression data alone. We use simulated data to demonstrate the improved performance of the joint modeling over other approaches. Application to the Lrp data also shows better performance of the joint modeling than that of analyzing the binding data alone.
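The idea of borrowing information from expression data can be illustrated with a small toy calculation. The sketch below is not the authors' nonparametric method: it assumes a hypothetical parametric setup in which each gene has a binding z-score and an expression z-score, both normal with unit variance, and in which all prior probabilities and means are fixed by hand rather than estimated. The function name `posterior_binding` and every parameter value are assumptions chosen for illustration only.

```python
import math


def normal_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def posterior_binding(
    z_bind,
    z_expr,
    p_change=0.2,                # assumed prior P(expression changed)
    p_bind_given_change=0.5,     # assumed P(bound | expression changed)
    p_bind_given_nochange=0.05,  # assumed P(bound | no expression change)
    mu_bind=3.0,                 # assumed mean binding z-score for bound genes
    mu_expr=3.0,                 # assumed mean expression z-score for changed genes
):
    """Posterior P(bound | z_bind, z_expr) in a toy two-source mixture model.

    The latent expression state E is summed out; the binding state B depends
    on E only through the two conditional prior probabilities.
    """
    num = 0.0  # joint density with B = 1
    den = 0.0  # joint density summed over B = 0 and B = 1
    for e, p_e in ((1, p_change), (0, 1.0 - p_change)):
        f_expr = normal_pdf(z_expr, mu_expr * e)
        p_b1 = p_bind_given_change if e == 1 else p_bind_given_nochange
        bound = p_e * f_expr * p_b1 * normal_pdf(z_bind, mu_bind)
        unbound = p_e * f_expr * (1.0 - p_b1) * normal_pdf(z_bind, 0.0)
        num += bound
        den += bound + unbound
    return num / den


# Same binding evidence, different expression evidence.
with_change = posterior_binding(z_bind=2.0, z_expr=4.0)
without_change = posterior_binding(z_bind=2.0, z_expr=0.0)
print(with_change, without_change)
```

In this toy setting, a gene with a moderate binding score receives a higher posterior probability of being a target when its expression score also indicates a change, which is the sense in which the expression data add power. In the paper the corresponding quantities are estimated by a nonparametric empirical Bayes procedure rather than fixed in advance.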

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62G05 Nonparametric estimation
62C12 Empirical decision procedures; empirical Bayes procedures
92C40 Biochemistry, molecular biology

Software:

EBarrays