For the two-sample case, we want a confidence interval based on the distribution of

$$\frac{(\bar{x}_{nb} - \bar{x}_{mb}) - (\bar{x}_n - \bar{x}_m)}{\sqrt{s_{nb}^2/n + s_{mb}^2/m}},$$

where n, m and s_nb, s_mb denote the sample sizes and standard deviations, respectively, of the bootstrap samples. Applying the Hall-Wilson corrections, we obtain narrower interval estimates that are more likely to contain the true value of the unknown parameter.
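As a concrete sketch, the two-sample bootstrap-t computation just described might look as follows in Python with NumPy. This is our own illustration, not taken from any of the packages mentioned in this chapter; the function name and defaults are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_t_two_sample(x, y, n_boot=1000, alpha=0.10, rng=rng):
    """Bootstrap-t interval for a difference in means.

    Each bootstrap pair of resamples contributes the studentized pivot
    ((xbar_b - ybar_b) - (xbar - ybar)) / sqrt(s2_xb/n + s2_yb/m).
    """
    n, m = len(x), len(y)
    diff = x.mean() - y.mean()
    se = np.sqrt(x.var(ddof=1) / n + y.var(ddof=1) / m)
    t_star = np.empty(n_boot)
    for b in range(n_boot):
        xb = rng.choice(x, size=n, replace=True)
        yb = rng.choice(y, size=m, replace=True)
        se_b = np.sqrt(xb.var(ddof=1) / n + yb.var(ddof=1) / m)
        t_star[b] = ((xb.mean() - yb.mean()) - diff) / se_b
    lo_q, hi_q = np.quantile(t_star, [alpha / 2, 1 - alpha / 2])
    # Note the reversal of the quantiles when mapping back to the interval.
    return diff - hi_q * se, diff - lo_q * se
```

Because the pivot is studentized resample by resample, the resulting interval need not be symmetric about the observed difference in means.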
The bias-corrected and accelerated BCa interval due to Efron and Tibshirani also represents a substantial improvement, though for samples under size 30 the interval is still suspect. The idea behind these intervals comes from the observation that percentile bootstrap intervals are most accurate when the estimate is symmetrically distributed about the true value of the parameter and the tails of the estimate's distribution drop off rapidly to zero. The symmetric, bell-shaped normal distribution depicted in Figure 7.1 represents this ideal.
Suppose θ is the parameter we are trying to estimate, θ̂ is the estimate, and we are able to come up with a monotone increasing transformation m such that m(θ̂) is normally distributed about m(θ). We could use this normal distribution to obtain an unbiased confidence interval, and then apply a back-transformation to obtain an almost-unbiased confidence interval.3
Even with these modifications, we do not recommend the use of the nonparametric bootstrap with samples of fewer than 100 observations. Simulation studies suggest that with small sample sizes, the coverage is far from exact and the endpoints of the intervals vary widely from one set of bootstrap samples to the next. For example, Tu and Zhang report that with samples of size 50 taken from a normal distribution, the actual coverage of an interval estimate rated at 90% using the BCa bootstrap is 88%. When the samples are taken from a mixture of two normal distributions (a not uncommon situation with real-life data sets), the actual coverage falls to 86%. With samples of only 20 observations, the actual coverage is just 80%.
A more serious problem in applying the bootstrap is that the endpoints of the resulting interval estimates may vary widely from one set of bootstrap samples to the next. For example, when Tu and Zhang drew samples of size 50 from a mixture of normal distributions, the average of the left limit of 1000 bootstrap samples taken from each of 1000 simulated data sets was 0.72 with a standard deviation of 0.16, and the average and standard deviation of the right limit were 1.37 and 0.30, respectively.
Even when we know the form of the population distribution, the use of the parametric bootstrap to obtain interval estimates may prove advantageous either because the parametric bootstrap provides more accurate answers than textbook formulas or because no textbook formulas exist.
Suppose we know that the observations come from a normal distribution and want an interval estimate for the standard deviation. We would draw repeated bootstrap samples from a normal distribution, the mean of which is the sample mean and the variance of which is the sample variance.
3 Stata™ provides for bias-corrected intervals via its bstrap command. R and S-Plus both include BCa functions. A SAS macro is available at http://www.asu.edu/it/fyi/research/helpdocs/statistics/SAS/tips/jackboot.html.
As a practical matter, we would draw an element from an N(0,1) population, multiply by the sample standard deviation, and then add the sample mean to obtain an element of our bootstrap sample. By computing the standard deviation of each bootstrap sample, we may derive an interval estimate for the standard deviation of the population.
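The recipe just described translates directly into code. The following Python/NumPy sketch is our own illustration (the function name and defaults are arbitrary): it draws N(0,1) deviates, rescales them by the sample standard deviation and mean, and takes a percentile interval over the bootstrap standard deviations.

```python
import numpy as np

def parametric_bootstrap_sd_interval(sample, n_boot=1000, alpha=0.10, seed=0):
    """Parametric bootstrap percentile interval for a normal population's
    standard deviation: resample from N(sample mean, sample variance)."""
    rng = np.random.default_rng(seed)
    n = len(sample)
    mean, sd = np.mean(sample), np.std(sample, ddof=1)
    # Draw N(0,1) deviates, scale by the sample sd, shift by the sample mean.
    boot = mean + sd * rng.standard_normal((n_boot, n))
    # The standard deviation of each bootstrap sample, row by row.
    boot_sds = boot.std(ddof=1, axis=1)
    lo, hi = np.quantile(boot_sds, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```

A percentile interval is used here for simplicity; the bias-corrected variants discussed above could be substituted.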
In many instances, we can obtain narrower interval estimates that have a greater probability of including the true value of the parameter by focusing on sufficient statistics, pivotal statistics, and admissible statistics.
A statistic T is sufficient for a parameter if the conditional distribution of the observations given this statistic T is independent of the parameter. If the observations in a sample are exchangeable, then the order statistics of the sample are sufficient; that is, if we know the order statistics x(1) < x(2) < ... < x(n), then we know as much about the unknown population distribution as we would if we had the original sample in hand. If the observations are on successive independent binomial trials that end in either success or failure, then the number of successes is sufficient to estimate the probability of success. The minimal sufficient statistic that reduces the observations to the fewest number of discrete values is always preferred.
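The binomial example can be checked directly: conditional on the number of successes, every sequence of trials is equally likely, no matter what the success probability is. The following short Python illustration is our own; the function name is arbitrary.

```python
from itertools import product
from math import comb

def conditional_seq_prob(seq, p):
    """P(this exact 0/1 sequence | its number of successes), as joint/marginal."""
    n, k = len(seq), sum(seq)
    joint = p ** k * (1 - p) ** (n - k)                   # P(this sequence)
    marginal = comb(n, k) * p ** k * (1 - p) ** (n - k)   # P(k successes in n)
    return joint / marginal                               # = 1 / C(n, k), free of p

# Every sequence of 4 trials has the same conditional probability
# whether p = 0.3 or p = 0.8: the success count is sufficient.
for seq in product([0, 1], repeat=4):
    assert abs(conditional_seq_prob(seq, 0.3) - conditional_seq_prob(seq, 0.8)) < 1e-12
```

The p-dependent factors cancel between numerator and denominator, which is exactly what sufficiency requires.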
A pivotal quantity is any function of the observations and the unknown parameter that has a probability distribution that does not depend on the parameter. The classic example is Student's t, whose distribution does not depend on the population mean or variance when the observations come from a normal distribution.
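This pivotal property is easy to verify numerically: shifting and scaling the same underlying standard-normal draws to represent different normal populations leaves Student's t unchanged, because the scale cancels between numerator and denominator. A small Python sketch of our own:

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.standard_normal(25)          # one fixed set of standard-normal draws

def t_stat(x, mu):
    """Student's t: (sample mean - mu) / (s / sqrt(n))."""
    n = len(x)
    return (x.mean() - mu) / (x.std(ddof=1) / np.sqrt(n))

# The same draws, shifted and scaled to two different normal populations,
# give identical t values: t is pivotal.
t1 = t_stat(0.0 + 1.0 * z, mu=0.0)   # sample from N(0, 1)
t2 = t_stat(5.0 + 3.0 * z, mu=5.0)   # sample from N(5, 9)
assert np.isclose(t1, t2)
```

Writing x = μ + σz shows why: the sample mean becomes μ + σz̄ and the sample standard deviation becomes σs_z, so σ cancels and t reduces to a function of z alone.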