# Common Errors in Statistics and How to Avoid Them - Good P.I


in each confidence interval represents the value of the estimate.

and the dimension of p was increased to p = 6 and S, bo, and b given in (4.3). For larger sample sizes, bias corrections to the apparent error became less important. It is still interesting, however, to compare mean squared errors. For all six simulations, I plot the RMSE1's in Figure 2 and the RMSE2's in Figure 3. It is interesting to note that the ordering of the root mean squared errors of the five estimates noticed in simulation 1.1 also held in the other five simulations. That is,

$$\mathrm{RMSE}_1(\hat r_{\mathrm{app}}) \approx \mathrm{RMSE}_1(\hat r_{\mathrm{cross}}) \approx \mathrm{RMSE}_1(\hat r_{\mathrm{jack}}),$$

and $\mathrm{RMSE}_1(\hat r_{\mathrm{boot}})$ is about one-third of the distance between $\mathrm{RMSE}_1(\hat r_{\mathrm{ideal}})$ and $\mathrm{RMSE}_1(\hat r_{\mathrm{app}})$. Similar remarks hold for RMSE2. Cross-validation and the jackknife offer no improvement over the apparent error, whereas the improvement given by the bootstrap is substantial.

The superiority of the bootstrap over cross-validation has been observed in other problems. Efron (1983) discussed estimates of excess error and

APPENDIX B EXCESS ERROR ESTIMATION IN FORWARD LOGISTIC REGRESSION 183

[Figure 3: one panel per simulation (1.1, 1.2, 1.3, 2.1, 2.2, 2.3); axis ticks from 0.00 to 0.20.]

FIGURE 3 95% (nonsimultaneous) Confidence Intervals for RMSE2. In each set of simulations, there are four confidence intervals for, respectively, apparent (A), cross-validation (C), jackknife (J), and bootstrap (B) estimates of the expected excess error. Notice that $\hat r_{\mathrm{app}} = 0$, so $\mathrm{RMSE}_2(\hat r_{\mathrm{app}})$ is the expected excess error, a constant; the “confidence interval” for $\mathrm{RMSE}_2(\hat r_{\mathrm{app}})$ is a single value, indicated by a single bar. In addition, $\mathrm{RMSE}_2(\hat r_{\mathrm{ideal}}) = 0$ and its confidence intervals are not shown. Some of the bootstrap confidence intervals are so small that they are indistinguishable from single bars.

performed several simulations with a flavor similar to mine. I report on only one of his simulations here. When the prediction rule is the usual Fisher discriminant and the training sample consists of 14 observations that are equally likely to come from $N((-\tfrac{1}{2}, 0), I)$ or $N((+\tfrac{1}{2}, 0), I)$, the RMSE1's of the apparent, cross-validation, bootstrap, and ideal estimates are, respectively, 0.149, 0.144, 0.134, and 0.114. Notice that the RMSE1's of the cross-validation and apparent estimates are close, whereas the RMSE1 of the bootstrap estimate is about halfway between those of the ideal and apparent estimates.
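The "about halfway" remark can be checked with a few lines of arithmetic. The sketch below uses only the four RMSE1 values quoted above; the function name `fraction_of_gap_closed` is a hypothetical helper introduced for illustration, not anything taken from Efron (1983).

```python
# RMSE1 values quoted in the text for the simulation from Efron (1983).
rmse1 = {
    "apparent": 0.149,
    "cross-validation": 0.144,
    "bootstrap": 0.134,
    "ideal": 0.114,
}

def fraction_of_gap_closed(value, apparent, ideal):
    """Share of the apparent-to-ideal gap removed (0 = apparent, 1 = ideal).

    Hypothetical helper for illustration only.
    """
    return (apparent - value) / (apparent - ideal)

gap = {
    name: fraction_of_gap_closed(value, rmse1["apparent"], rmse1["ideal"])
    for name, value in rmse1.items()
}

for name, share in gap.items():
    print(f"{name:>16}: {share:.2f} of the gap closed")
```

On these numbers the bootstrap closes roughly 0.43 of the gap between the apparent and ideal estimates, while cross-validation closes roughly 0.14.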

In the remainder of this section, I discuss the sufficiency of the number of bootstrap replications and the number of experiments.

Throughout the simulations, I used B = 100 bootstrap replications for each experiment. Denote


$$\hat r_B = \frac{1}{B} \sum_{b=1}^{B} R_b^{*}, \qquad M(B) = \mathrm{MSE}_1(\hat r_B).$$

Using a component-of-variance calculation (Gong 1982), for Simulation 1.1,

$$M^{1/2}(\infty) = 0.1070 \approx 0.1078 = M^{1/2}(100);$$

so if we are interested in comparing root mean squared errors about the excess error, we need not perform more than B = 100 bootstrap replications.
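As an illustration of what averaging B = 100 bootstrap replications involves, here is a minimal Python sketch. It is not the forward logistic regression rule of this appendix: the toy midpoint classifier, the simulated one-dimensional data, and every function name are assumptions made for the example. Each replication draws a resample with replacement, fits the rule to it, and records the rule's error on the original sample minus its apparent error on the resample; the estimate is the average of those differences.

```python
import random

def fit_midpoint_rule(xs, ys):
    """Toy prediction rule (hypothetical): cut at the midpoint of the class means."""
    x0 = [x for x, y in zip(xs, ys) if y == 0]
    x1 = [x for x, y in zip(xs, ys) if y == 1]
    if not x0 or not x1:
        # Degenerate resample containing one class: always predict that class.
        only = 1 if x1 else 0
        return lambda x: only
    m0, m1 = sum(x0) / len(x0), sum(x1) / len(x1)
    cut = (m0 + m1) / 2
    upper = 1 if m1 >= m0 else 0  # class predicted above the cut
    return lambda x: upper if x > cut else 1 - upper

def error_rate(rule, xs, ys):
    """Proportion of observations the rule misclassifies."""
    return sum(rule(x) != y for x, y in zip(xs, ys)) / len(xs)

def bootstrap_excess_error(xs, ys, B=100, seed=0):
    """Average over B resamples of (error on original sample - apparent error)."""
    rng = random.Random(seed)
    n = len(xs)
    reps = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        rule = fit_midpoint_rule(bx, by)
        reps.append(error_rate(rule, xs, ys) - error_rate(rule, bx, by))
    return sum(reps) / B, reps

# Simulated data (an assumption): 20 observations, two unit-variance classes.
data_rng = random.Random(1)
ys = [i % 2 for i in range(20)]
xs = [data_rng.gauss(y, 1.0) for y in ys]

r_boot, reps = bootstrap_excess_error(xs, ys, B=100)
print(f"bootstrap estimate of excess error from {len(reps)} replications: {r_boot:.4f}")
```

Rerunning with a larger `B` and comparing the resulting estimates gives an informal sense of how much is gained beyond 100 replications for this toy rule.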

In each simulation, I included 400 experiments and therefore used the approximation

$$\mathrm{MSE}_1(\hat r) = E[\hat r - R]^2 \approx \frac{1}{400} \sum_{e=1}^{400} [\hat r_e - R_e]^2,$$

where $\hat r_e$ and $R_e$ are the estimate and the true excess error of the $e$th experiment. Figures 2 and 3 show 95% nonsimultaneous confidence intervals for the RMSE1's and RMSE2's. Shorter intervals for the RMSE1's would be preferable, but obtaining them would be time-consuming. Four hundred experiments of simulation 1.1 with p = 4, n = 20, and B = 100 took 16 computer hours on the PDP-11/34 minicomputer, whereas 400 experiments of simulation 2.3 with p = 6, n = 60, and B = 100 took 72 hours. Halving the length of the confidence intervals in Figures 2 and 3 would require four times the number of experiments and four times the computer time. On the other hand, for each simulation in Figure 3, the confidence interval for $\mathrm{RMSE}_2(\hat r_{\mathrm{ideal}})$ is disjoint from that of $\mathrm{RMSE}_2(\hat r_{\mathrm{boot}})$, and both are disjoint from the confidence intervals for $\mathrm{RMSE}_2(\hat r_{\mathrm{jack}})$, $\mathrm{RMSE}_2(\hat r_{\mathrm{cross}})$, and $\mathrm{RMSE}_2(\hat r_{\mathrm{app}})$. Thus, for RMSE2, we can convincingly argue that the number of experiments is sufficient.
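The Monte Carlo approximation and the cost of shorter intervals can be sketched in Python as follows. The function names are hypothetical, and the cost calculation assumes that interval length shrinks in proportion to one over the square root of the number of experiments and that computer time grows linearly with the number of experiments, which is what the factor of four above implies.

```python
import math

def rmse1(estimates, truths):
    """Root of the average squared difference between estimate and true excess error."""
    if len(estimates) != len(truths):
        raise ValueError("need one true excess error per estimate")
    n = len(estimates)
    return math.sqrt(sum((r - R) ** 2 for r, R in zip(estimates, truths)) / n)

def experiments_needed(current_experiments, shrink_factor):
    """Experiments required to shorten the intervals by `shrink_factor`.

    Assumes interval length is proportional to 1 / sqrt(number of experiments).
    """
    return current_experiments * shrink_factor ** 2

def hours_needed(current_hours, shrink_factor):
    """Computer time under the same assumption, with time linear in experiments."""
    return current_hours * shrink_factor ** 2

# Tiny made-up inputs, only to show the call shape (not results from the text).
toy_rmse = rmse1([0.1, 0.2], [0.0, 0.0])

# Halving the interval length, starting from the figures quoted in the text.
print(experiments_needed(400, 2))  # experiments per simulation
print(hours_needed(16, 2))         # hours for simulation 1.1
print(hours_needed(72, 2))         # hours for simulation 2.3
```

Under these assumptions, halving the intervals means 1,600 experiments per simulation, or 64 hours for simulation 1.1 and 288 hours for simulation 2.3 on the same machine.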
