Received: 17 January 2019 / Approved: 21 January 2019 / Online: 21 January 2019 (09:21:28 CET)
Received: 31 May 2019 / Approved: 31 May 2019 / Online: 31 May 2019 (10:37:48 CEST)
How to cite:
Burgess, M.A.; Chapman, A.C. Stratified Finite Empirical Bernstein Sampling. Preprints 2019, 2019010202 (doi: 10.20944/preprints201901.0202.v2).
We derive a concentration inequality for the uncertainty in the mean computed by stratified random sampling, and provide an online sampling method based on this inequality. Our concentration inequality is versatile and accounts for a range of factors, including the data ranges, weights, and sizes of the strata, the number of samples taken, the estimated sample variances, and whether strata are sampled with or without replacement. Sequentially choosing samples to minimize the bound given by this inequality leads to an online method for choosing samples from a stratified population. We evaluate our method against others on synthetic data sets and in approximating the Shapley value of cooperative games. Results show that our method is competitive with Neyman sampling with perfect variance information, even without prior information on strata variances. We also provide a multidimensional extension of our inequality and discuss future applications.
Concentration Inequality, Empirical Bernstein Bound, Stratified Random Sampling, Shapley Value Approximation
MATHEMATICS & COMPUTER SCIENCE, Probability and Statistics
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.