zbMATH — the first resource for mathematics

Dimension reduction strategies for analyzing global gene expression data with a response. (English) Zbl 0999.62090
Summary: The analysis of global gene expression data from microarrays is breaking new ground in genetics research, while confronting modelers and statisticians with many critical issues. We consider data sets in which a categorical or continuous response is recorded, along with gene expression, on a given number of experimental samples. Data of this type are usually employed to create a prediction mechanism for the response based on gene expression, and to identify a subset of relevant genes. This defines a regression setting characterized by a dramatic under-resolution with respect to the predictors (genes), whose number exceeds by orders of magnitude the number of available observations (samples). We present a dimension reduction strategy that, under appropriate assumptions, allows us to restrict attention to a few linear combinations of the original expression profiles, and thus to overcome under-resolution. These linear combinations can then be used to build and validate a regression model with standard techniques. Moreover, they can be used to rank original predictors, and ultimately to select a subset of them through comparison with a background "chance scenario" based on a number of independent randomizations. We apply this strategy to publicly available data on leukemia classification.
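The pipeline outlined in the summary (reduce to a few linear combinations, rank genes, compare with a randomized background) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the preliminary reduction by SVD, the use of sliced inverse regression (the method of reference [11]) for the final direction, the weighting of gene scores by the square root of the leading SIR eigenvalue, and the function names `sir_direction`, `gene_scores` and `chance_threshold`.

```python
import numpy as np

def sir_direction(Z, y, n_slices=2):
    """Leading sliced-inverse-regression direction for predictors Z (n x k, k < n)."""
    n, k = Z.shape
    Zc = Z - Z.mean(axis=0)
    # Whiten Z with the inverse square root of its sample covariance.
    evals, evecs = np.linalg.eigh(Zc.T @ Zc / n)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    W = Zc @ inv_sqrt
    # Slice the samples by sorted response and accumulate the weighted
    # outer products of the slice means.
    M = np.zeros((k, k))
    for idx in np.array_split(np.argsort(y, kind="stable"), n_slices):
        m = W[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    vals, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    # Map the top eigenvector back to the scale of Z; also return its eigenvalue.
    return inv_sqrt @ vecs[:, -1], max(vals[-1], 0.0)

def gene_scores(X, y, n_components=3, n_slices=2):
    """Score each gene (column of X) by its absolute loading on the SIR direction."""
    Xc = X - X.mean(axis=0)
    # Assumed preliminary step: project onto the leading right singular vectors
    # so that the number of predictors drops below the number of samples.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T              # genes x components
    b, strength = sir_direction(Xc @ V, y, n_slices)
    # Assumed weighting: scale by the strength of the direction so that
    # scores from permuted responses are comparable with observed ones.
    return np.abs(V @ b) * np.sqrt(strength)

def chance_threshold(X, y, n_perm=50, quantile=0.95, seed=0, **kwargs):
    """Quantile of the maximum gene score over independent permutations of y."""
    rng = np.random.default_rng(seed)
    maxima = [gene_scores(X, rng.permutation(y), **kwargs).max()
              for _ in range(n_perm)]
    return float(np.quantile(maxima, quantile))
```

Under these assumptions, genes whose observed score exceeds `chance_threshold(X, y)` would be the ones retained; the number of components, slices and permutations are tuning choices that the summary does not specify.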

62P10 Applications of statistics to biology and medical sciences
[1] O. Alter, P.O. Brown, D. Botstein, Singular value decomposition for genome-wide expression data processing and modeling, Proceedings of the National Academy of Sciences, vol. 97, 2000, p. 10101
[2] F. Chiaromonte, R.D. Cook, Sufficient dimension reduction and graphics in regression, Ann. Inst. Stat. Math., in press · Zbl 1047.62066
[3] F. Chiaromonte, R.D. Cook, B. Li, Sufficient dimension reduction in regressions with categorical predictors, Ann. Stat., in press · Zbl 1012.62036
[4] F. Chiaromonte, Structures and exhaustive reductions: a general framework for the simplification of multivariate data, submitted for publication
[5] L. Chin-Shang, J.M.G. Taylor, Ridge regression for the classification of tumors using gene expression data, ENAR/IMS Spring meeting 2001, Charlotte, NC
[6] Cook, R. D.: Regression graphics. (1998) · Zbl 0903.62001
[7] S. Dudoit, J. Fridlyand, T.P. Speed, Comparison of discriminant methods for the classification of tumors using gene expression data, J. Am. Stat. Assoc., in press · Zbl 1073.62576
[8] Golub, T. R.; Slonim, D. K.; Tamayo, P.; Huard, C.; Gaasenbeek, M.; Mesirov, J. P.; Coller, H.; Loh, M. L.; Downing, J. R.; Caligiuri, M. A.; Bloomfield, C. D.; Lander, E. S.: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531 (1999)
[9] N.S. Holter, M. Mitra, A. Maritan, M. Cieplak, J.R. Banavar, N.V. Fedoroff, Fundamental patterns underlying gene expression profiles: simplicity from complexity, Proceedings of the National Academy of Sciences, vol. 97, 2000, p. 8409
[10] T. Hastie, R. Tibshirani, M.B. Eisen, A. Alizadeh, R. Levy, L. Staudt, W.C. Chan, D. Botstein, P. Brown, Gene Shaving as a method for identifying distinct sets of genes with similar expression patterns, Genome Biology 1 (2) (2000): research0003.1-0003.21
[11] Li, K. C.: Sliced inverse regression for dimension reduction (with discussion). J. Am. Stat. Assoc. 86, 316 (1991) · Zbl 0742.62044
[12] D.V. Nguyen, D.M. Rocke, Tumor classification by partial least squares using microarray gene expression data, CAMDA 2001. Available from http://handel.cipic.ucdavis.edu/~dmrocke
[13] Perou, C. M.; Sorlie, T.; Eisen, M. B.; Van De Rijn, M.; Jeffrey, S. S.; Rees, C. A.; Pollack, J. R.; Ross, D. T.; Johnsen, H.; Akslen, L. A.; Fluge, O.; Pergamenschikov, A.; Williams, C.; Zhu, S. X.; Lonning, P. E.; Borresen-Dale, A.; Brown, P. O.; Botstein, D.: Molecular portraits of human breast tumors. Nature 406, 747 (2000)
[14] Schott, J.: Determining the dimensionality in sliced inverse regression. J. Am. Stat. Assoc. 89, 141 (1994) · Zbl 0791.62069
[15] Velilla, S.: Assessing the number of linear components in a general regression problem. J. Am. Stat. Assoc. 93, 1088 (1998) · Zbl 1063.62553
[16] M. West et al., DNA microarray data analysis and regression modeling for genetic expression profiling, CAMDA 2000. Available from http://www.stat.duke.edu/bioinformatics/bayes.html