Optimal Inference in Large-Scale Problems

Seminar: 
Analysis of Large Data Sets
Speaker: 
Daniel Yekutieli (Tel Aviv University)
Date: 
Friday, 16 April 2021 - 15:30
Room: 
https://lu-se.zoom.us/j/65067339175
Description: 

Bayesian modeling is ubiquitous in large-scale problems, even when frequentist criteria are used to evaluate the performance of a procedure. In particular, regularized estimation methods, which may be derived by eliciting a prior distribution on the model parameters, have been shown to be especially effective for analyzing large data sets.

Appealing to Robbins’s compound decision theory, we introduce a theoretical framework for deriving optimal Bayes rules in which the prior distribution consists of permutations of the parameter vector. For the special case of “symmetric” statistical problems, we show that our Bayes rules also minimize the frequentist risk for any fixed configuration of the parameter vector.
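
As a worked illustration of a prior that "consists of permutations of the parameter vector", the display below writes out the posterior mean under such a prior. It is only a sketch under assumptions that the abstract does not state: squared-error loss, a likelihood written as $f(y \mid \theta)$, a uniform prior over the $n!$ permutations $\sigma \in S_n$ of a fixed vector $\theta = (\theta_1, \dots, \theta_n)$, and the notation $(\sigma\theta)_i = \theta_{\sigma(i)}$.

```latex
\[
  \delta_i(y)
  \;=\;
  \mathbb{E}\bigl[(\sigma\theta)_i \,\big|\, y\bigr]
  \;=\;
  \frac{\sum_{\sigma \in S_n} \theta_{\sigma(i)}\, f(y \mid \sigma\theta)}
       {\sum_{\sigma \in S_n} f(y \mid \sigma\theta)},
  \qquad i = 1, \dots, n .
\]
```

The rule depends on $\theta$ only through its unordered values, which is why an estimate of the marginal distribution of the parameters can stand in for the unknown configuration.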

Our main practical contribution is a nonparametric deconvolution methodology, based on hierarchical Bayes modeling, that approximates the marginal parameter distribution. We use this methodology to approximate the theoretical Bayes rule. The methodology is shown to be particularly effective in low-signal, high-dimensional problems: even though the individual components of the parameter vector are difficult to estimate, we can still tease out the marginal distribution of the parameter vector, and the resulting Bayes rules therefore perform better than state-of-the-art shrinkage estimators. Furthermore, as large-scale problems tend to be approximately symmetric, our Bayes rules provide near-optimal frequentist performance.
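
The idea of recovering the marginal parameter distribution and plugging it into a Bayes rule can be sketched on a toy problem. The Python example below is not the hierarchical Bayes methodology of the talk. It swaps in a simpler, well-known technique, an EM estimate of mixing weights on a fixed grid, and it assumes a normal means model $y_i \sim N(\theta_i, 1)$ rather than logistic regression. The function names, the grid, and the simulated signal are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_mixing_weights(y, grid, n_iter=200):
    """EM estimate of mixing weights on a fixed grid, assuming y_i ~ N(theta_i, 1).

    This is a stand-in for the talk's hierarchical Bayes deconvolution,
    not an implementation of it.
    """
    # lik[i, k] = N(y_i | grid[k], 1)
    lik = np.exp(-0.5 * (y[:, None] - grid[None, :]) ** 2) / np.sqrt(2 * np.pi)
    w = np.full(grid.size, 1.0 / grid.size)  # start from uniform weights
    for _ in range(n_iter):
        post = lik * w                        # unnormalized posterior over grid points
        post /= post.sum(axis=1, keepdims=True)
        w = post.mean(axis=0)                 # EM update of the mixing weights
    return w, lik


def posterior_mean(w, lik, grid):
    """Plug-in Bayes rule under squared-error loss: E[theta_i | y_i]."""
    post = lik * w
    post /= post.sum(axis=1, keepdims=True)
    return post @ grid


# Simulated low-signal example (hypothetical settings).
rng = np.random.default_rng(0)
n = 2000
theta = rng.choice([0.0, 1.0], size=n, p=[0.8, 0.2])
y = theta + rng.standard_normal(n)

grid = np.linspace(-3.0, 4.0, 71)
w, lik = fit_mixing_weights(y, grid)
theta_hat = posterior_mean(w, lik, grid)

mse_raw = np.mean((y - theta) ** 2)          # using y_i itself as the estimate
mse_plugin = np.mean((theta_hat - theta) ** 2)
print(f"MSE of raw observations: {mse_raw:.3f}")
print(f"MSE of plug-in Bayes rule: {mse_plugin:.3f}")
```

In this toy setting no single $\theta_i$ can be pinned down from its own noisy observation, yet the estimated weights `w` pool information across all observations, and the plug-in rule shrinks each $y_i$ accordingly.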

For concreteness and clarity, I will present the theoretical framework and the hierarchical Bayes modeling for high-dimensional logistic regression and demonstrate the methodology on several simulated examples.