ePCA
swMATH ID:  27830 
Software Authors:  Lydia T. Liu, Edgar Dobriban, Amit Singer 
Description:  ePCA: high dimensional exponential family PCA. Many applications involve large datasets with entries from exponential family distributions. Our main motivating application is photon-limited imaging, where we observe images with Poisson distributed pixels. We focus on X-ray Free Electron Lasers (XFEL), a quickly developing technology whose goal is to reconstruct molecular structure. In XFEL, estimating the principal components of the noiseless distribution is needed for denoising and for structure determination. However, the standard method, Principal Component Analysis (PCA), can be inefficient in non-Gaussian noise. Motivated by this application, we develop ePCA (exponential family PCA), a new methodology for PCA on exponential families. ePCA is a fast method that can be used very generally for dimension reduction and denoising of large data matrices with exponential family entries. We conduct a substantive XFEL data analysis using ePCA. We show that ePCA estimates the PCs of the distribution of images more accurately than PCA and alternatives. Importantly, it also leads to better denoising. We also provide theoretical justification for our estimator, including the convergence rate and the Marchenko-Pastur law in high dimensions. An open-source implementation is available.
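Illustration:  The description notes that plain PCA can be inefficient when pixels are Poisson distributed. One reason is that Poisson noise adds each pixel's mean to its variance, so the sample covariance of the observed counts is inflated on the diagonal. The sketch below is a minimal NumPy illustration of removing that inflation before extracting principal components. It is not the authors' implementation and does not use the API of the linked repository; the function names `debiased_covariance` and `top_components` and the synthetic data are assumptions made for this example, and the published method may involve further steps not shown here.

```python
import numpy as np


def debiased_covariance(Y):
    """Sample covariance of Poisson counts Y (n_samples x n_pixels),
    with the Poisson noise variance (equal to the per-pixel mean) removed
    from the diagonal. Hypothetical helper, for illustration only."""
    Y = np.asarray(Y, dtype=float)
    mean = Y.mean(axis=0)
    S = np.cov(Y, rowvar=False)      # covariance of the noisy observations
    return S - np.diag(mean)         # estimate of the clean-intensity covariance


def top_components(cov, k):
    """Return the k largest eigenvalues and their eigenvectors."""
    w, V = np.linalg.eigh(cov)       # eigh returns eigenvalues in ascending order
    order = np.argsort(w)[::-1][:k]
    return w[order], V[:, order]


# Synthetic example (assumed setup): intensities vary along one direction v.
rng = np.random.default_rng(0)
n, p = 20000, 10
v = np.where(np.arange(p) % 2 == 0, 1.0, -1.0)   # hidden direction
z = rng.uniform(-1.0, 1.0, size=n)               # latent factor, variance 1/3
X = 5.0 + 2.0 * z[:, None] * v[None, :]          # clean intensities in [3, 7]
Y = rng.poisson(X)                               # observed photon counts

cov = debiased_covariance(Y)
vals, vecs = top_components(cov, k=1)
alignment = abs(vecs[:, 0] @ (v / np.linalg.norm(v)))
print("alignment of leading component with true direction:", alignment)
```

In this synthetic setup the clean covariance is (4/3) v vᵀ, so the diagonal of `cov` should sit near 4/3 rather than near 5 + 4/3, which is what the raw sample covariance would give.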
Homepage:  https://arxiv.org/abs/1611.05550 
Source Code:  https://github.com/lydiatliu/epca 
Related Software:  OptShrink; Pyglrm; LowRankModels; softImpute; R; ScreeNOT; ElemStatLearn; Mcmcpack; WordNet; OBOE; Eigenstrat; denoiseR; lori; qut; NLopt; GPSeq; GitHub; PMA; nloptr; ggplot2 
Cited in:  11 Documents 
Standard Articles
1 Publication describing the Software, including 1 Publication in zbMATH

\(e\)PCA: high dimensional exponential family PCA. Zbl 1411.62376 Liu, Lydia T.; Dobriban, Edgar; Singer, Amit (2018)

Cited by 27 Authors
Cited in 9 Serials
Cited in 4 Fields
10  Statistics (62XX) 
2  Probability theory and stochastic processes (60XX) 
1  Linear and multilinear algebra; matrix theory (15XX) 
1  Functional analysis (46XX) 