Bootstrap Methods: When Theory Meets Computation
September 10, 2021
The bootstrap is a trade: mathematical complexity for computational burden. Instead of deriving analytical formulas for sampling distributions, you simulate them.
The Idea
If you don’t know the sampling distribution of a statistic, approximate it by resampling from your data.
- Draw samples with replacement from the original data
- Compute your statistic on each resample
- The distribution of resampled statistics approximates the true sampling distribution
That’s it. The justification is more subtle than the procedure. Under regularity conditions, the bootstrap distribution converges to the true sampling distribution as sample size grows. This is non-parametric inference: you use the empirical distribution as a stand-in for the true distribution, without assuming a parametric form.
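The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `bootstrap_distribution`, the simulated normal sample, and the choice of the mean as the statistic are all assumptions made for the example. The mean is used only because its standard error has a known formula, so the bootstrap spread can be checked against it.

```python
import random
import statistics


def bootstrap_distribution(data, statistic, n_resamples=2000, seed=0):
    """Compute `statistic` on `n_resamples` resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    # Each resample has the same size as the original data.
    return [statistic(rng.choices(data, k=n)) for _ in range(n_resamples)]


# Hypothetical data: 100 draws from a normal distribution.
rng = random.Random(42)
sample = [rng.gauss(10.0, 2.0) for _ in range(100)]

boot_means = bootstrap_distribution(sample, statistics.fmean)

# The spread of the resampled means estimates the standard error of the mean.
boot_se = statistics.stdev(boot_means)
analytic_se = statistics.stdev(sample) / len(sample) ** 0.5
print(f"bootstrap SE: {boot_se:.3f}  analytic SE: {analytic_se:.3f}")
```

The two numbers should land close to each other. The same function works unchanged for a statistic with no convenient formula, which is where the method earns its keep.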
When I Use It
Bootstrap is my default tool when:
- I need confidence intervals for statistics with no closed-form variance
- Standard asymptotic approximations are unreliable (small samples, non-standard statistics)
- I’m doing model selection via bootstrap cross-validation
- I’m working with censored data where standard errors are intractable
That last case is the one that matters most for my research.
The Computational Trade
Better to get the right answer slowly than the wrong answer quickly.
Deriving an analytical variance formula is hard, and for the statistic you actually care about it is sometimes impossible. Bootstrap says: just compute the statistic 10,000 times on resampled data and look at the spread. With modern hardware, 10,000 resamples of a cheap statistic take seconds.
The trade is almost always worth it.
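As a rough illustration of that trade, here is a sketch of a percentile confidence interval for the median of a skewed sample, using 10,000 resamples. The helper name `percentile_ci`, the simulated exponential data, and the simple percentile method are assumptions for the example; other interval constructions (for instance BCa) exist and may behave better. The timing printed at the end depends entirely on the machine and on how expensive the statistic is.

```python
import random
import statistics
import time


def percentile_ci(data, statistic, n_resamples=10_000, alpha=0.05, seed=0):
    """Simple percentile bootstrap interval for `statistic`."""
    rng = random.Random(seed)
    n = len(data)
    stats = sorted(statistic(rng.choices(data, k=n)) for _ in range(n_resamples))
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the resampled statistics.
    lower = stats[round((alpha / 2) * n_resamples)]
    upper = stats[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical skewed data: 80 draws from an exponential distribution.
rng = random.Random(7)
sample = [rng.expovariate(1.0) for _ in range(80)]

start = time.perf_counter()
lower, upper = percentile_ci(sample, statistics.median)
elapsed = time.perf_counter() - start

print(f"sample median: {statistics.median(sample):.3f}")
print(f"95% percentile interval: ({lower:.3f}, {upper:.3f})")
print(f"elapsed: {elapsed:.2f} s")
```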
My Thesis Work
My research uses bootstrap heavily. I’m working on reliability estimation for series systems where components fail and you don’t know which one caused the system failure. This is the masked failure data problem.
For these models, the MLE exists and you can compute it, but closed-form variance formulas are not available. The Fisher information matrix involves expectations over the masking distribution that don’t simplify to anything closed-form.
Bootstrap gives me confidence intervals anyway. Resample the masked failure data, recompute the MLE on each resample, and use the distribution of bootstrapped MLEs to construct intervals. It’s not elegant, but it works, and “works” is the right criterion when the alternative is “no confidence intervals at all.”
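The resample-and-refit loop can be sketched on a much simpler stand-in problem. The example below is not the masked series-system model. It assumes a single exponential lifetime with right-censoring at a fixed cutoff, where the MLE of the failure rate is the number of observed failures divided by total time on test. The names `exp_rate_mle` and `bootstrap_mle_ci`, the simulated data, and the percentile interval are all illustrative assumptions.

```python
import random


def exp_rate_mle(observations):
    """MLE of an exponential failure rate from (time, failed) pairs.

    `failed` is False when the unit was still running at `time` (right-censored).
    """
    failures = sum(1 for _, failed in observations if failed)
    exposure = sum(t for t, _ in observations)
    return failures / exposure


def bootstrap_mle_ci(observations, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile interval from MLEs recomputed on resampled observations."""
    rng = random.Random(seed)
    n = len(observations)
    # Resample whole (time, failed) pairs so censoring status stays attached.
    estimates = sorted(
        exp_rate_mle(rng.choices(observations, k=n)) for _ in range(n_resamples)
    )
    lo = estimates[round((alpha / 2) * n_resamples)]
    hi = estimates[round((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Hypothetical data: 150 exponential lifetimes, censored at a fixed cutoff.
rng = random.Random(1)
true_rate, cutoff = 0.5, 3.0
observations = []
for _ in range(150):
    t = rng.expovariate(true_rate)
    observations.append((min(t, cutoff), t <= cutoff))

rate_hat = exp_rate_mle(observations)
lo, hi = bootstrap_mle_ci(observations)
print(f"MLE: {rate_hat:.3f}  95% bootstrap interval: ({lo:.3f}, {hi:.3f})")
```

The structure carries over to harder models: swap `exp_rate_mle` for whatever routine fits the model, and the interval code stays the same. The cost is one full refit per resample.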
IEEE Paper: Estimating Encrypted Search Confidentiality via Bootstrap
November 2, 2016
This is my first IEEE publication, co-authored with Professor Hiroshi Fujinoki. The problem: if you encrypt search queries but an adversary can observe the ciphertext traffic, how many queries do they need before a frequency attack succeeds?
We used the Moving Average Bootstrap (MAB) method to estimate that threshold. The idea is that encrypted search leaks frequency information (how often each ciphertext appears), and an adversary can correlate those frequencies against known plaintext distributions. The bootstrap lets us put confidence intervals on the number of observations needed, without requiring a closed-form solution.
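To make the frequency leak concrete, here is a toy simulation of a rank-matching attack. It is not the paper's Moving Average Bootstrap method or its adversary model. It assumes deterministic encryption (each keyword always maps to the same token), a Zipf-like query distribution that the adversary knows exactly, and an adversary who simply pairs the i-th most frequent token with the i-th most likely keyword. The function name `attack_accuracy` and all parameters are hypothetical.

```python
import random
from collections import Counter


def attack_accuracy(n_queries, n_keywords=20, seed=0):
    """Fraction of keywords recovered by frequency-rank matching."""
    rng = random.Random(seed)
    # Keywords are indexed in order of decreasing query probability.
    keywords = list(range(n_keywords))
    weights = [1.0 / (rank + 1) for rank in range(n_keywords)]  # Zipf-like

    # Deterministic encryption modeled as a secret random permutation.
    tokens = keywords[:]
    rng.shuffle(tokens)
    encrypt = dict(zip(keywords, tokens))

    # The adversary observes only the encrypted tokens.
    queries = rng.choices(keywords, weights=weights, k=n_queries)
    observed = Counter(encrypt[k] for k in queries)

    # Guess: i-th most frequent token corresponds to i-th most likely keyword.
    ranked_tokens = [tok for tok, _ in observed.most_common()]
    guess = {tok: kw for kw, tok in zip(keywords, ranked_tokens)}

    correct = sum(1 for kw in keywords if guess.get(encrypt[kw]) == kw)
    return correct / n_keywords


for n in (50, 500, 5000, 50_000):
    print(n, attack_accuracy(n))
```

Accuracy generally climbs as the adversary observes more queries, which is the sense in which the number of observations is the quantity worth estimating. Repeating the simulation over many seeds gives a spread around any threshold one might read off it.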
This came out of my MS thesis work on encrypted search at SIU. The core question (how much does encrypted search actually leak?) turns out to be harder than it sounds, because the answer depends on the plaintext distribution, the query distribution, and how patient the adversary is. The bootstrap approach gives us a way to answer it empirically.
For more related work, see my research page and publications.