# Common Errors in Statistics and How to Avoid Them - Good P.I

5. THE RELATIONSHIP BETWEEN CROSS-VALIDATION AND THE JACKKNIFE

Efron (1982) conjectured that the cross-validation and jackknife estimates of excess error are asymptotically close. Gong (1982) proved Efron’s conjecture. Unfortunately, the regularity conditions stated there do not hold for Gregory’s rule. The conjecture seems to hold for Gregory’s rule nonetheless, as evidenced in Figure 4, a scatterplot of the jackknife and cross-validation estimates for the first 100 experiments of simulation 1.1. The plot shows points hugging the 45° line, whereas a scatterplot of the bootstrap and cross-validation estimates exhibits no such behavior.

APPENDIX B EXCESS ERROR ESTIMATION IN FORWARD LOGISTIC REGRESSION 185

[Figure 4: two scatterplot panels, jack versus cross and boot versus cross. The horizontal axis of each panel is cross; all axes run from 0.00 to 0.40.]

FIGURE 4 Scatterplots to Compare r_cross, r_jack, and r_boot. The scatterplots summarize the relationships among the three estimates for the first 100 experiments of simulation 1.1. The numerals indicate the number of observations; * indicates greater than 9.

6. CONCLUSIONS

Because complicated prediction rules depend intricately on the data and thus have grossly optimistic apparent errors, error rate estimation for complicated prediction rules is an important problem. Cross-validation is a time-honored tool for improving the apparent error. This article compares cross-validation with two other methods, the jackknife and the bootstrap. With the help of increasingly available computer power, all three methods are easily applied to Gregory’s complicated rule for predicting the outcome of chronic hepatitis. Simulations suggest that whereas the jackknife and cross-validation do not offer significant improvement over the apparent error, the bootstrap shows substantial gain.
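The following Python sketch illustrates how the apparent error, a leave-one-out cross-validation error, and a bootstrap estimate of excess error might be computed. It is not Gregory’s rule or the article’s simulation: the nearest-class-mean rule, the function names, and the simulated data are all assumptions made for illustration, and excess error is taken here to mean the amount by which the true error rate exceeds the apparent error rate. The jackknife estimate is omitted because its definition is not restated in this section.

```python
import random

def fit(train):
    """Hypothetical rule: predict the class whose training mean is nearest to x."""
    groups = {0: [], 1: []}
    for x, y in train:
        groups[y].append(x)
    # Skip a class that is absent from the training sample (possible in a bootstrap draw).
    means = {c: sum(v) / len(v) for c, v in groups.items() if v}
    return lambda x: min(means, key=lambda c: abs(x - means[c]))

def error_rate(rule, data):
    """Proportion of observations in data that the rule misclassifies."""
    return sum(rule(x) != y for x, y in data) / len(data)

def apparent_error(data):
    """Error of the rule on the same data used to fit it."""
    return error_rate(fit(data), data)

def cross_validation_error(data):
    """Leave-one-out error: each point is predicted by a rule fit without it."""
    misses = 0
    for i, (x, y) in enumerate(data):
        rule = fit(data[:i] + data[i + 1:])
        misses += rule(x) != y
    return misses / len(data)

def bootstrap_excess_error(data, n_boot=200, rng=None):
    """Average, over bootstrap samples, of (error on original data) minus
    (error on the bootstrap sample) for the rule fit to the bootstrap sample."""
    rng = rng or random.Random(0)
    n = len(data)
    total = 0.0
    for _ in range(n_boot):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        rule = fit(sample)
        total += error_rate(rule, data) - error_rate(rule, sample)
    return total / n_boot

# Simulated two-class data (an assumption, not the hepatitis data).
demo_rng = random.Random(1)
demo = [(demo_rng.gauss(y, 1.0), y) for y in [0, 1] * 20]
app = apparent_error(demo)
print("apparent error:", app)
print("cross-validation estimate of excess error:", cross_validation_error(demo) - app)
print("bootstrap estimate of excess error:", bootstrap_excess_error(demo))
```

With a rule this simple the apparent error is only mildly optimistic; the gap the article describes arises with complicated rules that adapt heavily to the data.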

REFERENCES

Efron B. The Jackknife, the Bootstrap, and Other Resampling Plans, Philadelphia: Society for Industrial and Applied Mathematics, 1982.

Efron B. “Estimating the Error Rate of a Prediction Rule: Improvements on Cross-Validation,” Journal of the American Statistical Association, 1983; 78:316-331.

Friedman JH. “A Recursive Partitioning Decision Rule for Nonparametric Classification,” IEEE Transactions on Computers, 1977; C-26:404-408.

Geisser S. “The Predictive Sample Reuse Method With Applications,” Journal of the American Statistical Association, 1975; 70:320-328.

Gong G. “Cross-Validation, the Jackknife, and the Bootstrap: Excess Error Estimation in Forward Logistic Regression,” unpublished Ph.D. thesis, Stanford University, 1982.

Rao CR. Linear Statistical Inference and Its Applications, New York: John Wiley, 1973.

Stone M. “Cross-Validatory Choice and Assessment of Statistical Predictions,” Journal of the Royal Statistical Society, Series B, 1974; 36:111-147.


Glossary, Grouped by Related but Distinct Terms

ACCURACY AND PRECISION

An accurate estimate is close to the estimated quantity. A precise interval estimate is a narrow one. Precise measurements made with a dozen or more decimal places may still not be accurate.
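As a rough illustration of the distinction, the sketch below simulates two hypothetical instruments measuring the same quantity: one with a tiny spread but a systematic offset, the other centered on the truth but noisy. The true value, the offset, and the noise levels are invented for the example.

```python
import random
import statistics

def summarize(readings, true_value):
    """Return (bias, spread). A small bias indicates accuracy;
    a small spread indicates precision."""
    bias = statistics.mean(readings) - true_value
    spread = statistics.stdev(readings)
    return bias, spread

rng = random.Random(42)
true_value = 10.0  # assumed quantity being measured

# Precise but not accurate: readings agree to several decimals, yet sit 0.5 too high.
precise_but_biased = [rng.gauss(true_value + 0.5, 0.001) for _ in range(1000)]

# Accurate but not precise: readings scatter widely around the true value.
accurate_but_noisy = [rng.gauss(true_value, 0.3) for _ in range(1000)]

print("precise but biased:", summarize(precise_but_biased, true_value))
print("accurate but noisy:", summarize(accurate_but_noisy, true_value))
```

The first instrument reports many stable decimal places and is still wrong by about half a unit, which is the situation the definition warns about.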

DETERMINISTIC AND STOCHASTIC

A phenomenon is deterministic when its outcome is inevitable and all observations will take a specific value.¹ A phenomenon is stochastic when its outcome may take different values in accordance with some probability distribution.

DICHOTOMOUS, CATEGORICAL, ORDINAL, METRIC DATA

Dichotomous data have two values and take the form “yes or no,” “got better or got worse.”

Categorical data have two or more categories such as yes, no, and undecided. Categorical data may be ordered (opposed, indifferent, in favor) or unordered (dichotomous, categorical, ordinal, metric).

Preferences can be placed on an ordered or ordinal scale such as strongly opposed, opposed, indifferent, in favor, strongly in favor.

Metric data can be placed on a scale that permits meaningful subtraction; for example, while “in favor” minus “indifferent” may not be meaningful, 35.6 pounds minus 30.2 pounds is.

Metric data can be grouped so as to evaluate them by statistical methods applicable to categorical or ordinal data. But to do so would be to throw away information and also reduce the power of any tests and the precision of any estimates.

¹ These observations may be subject to measurement error.

DISTRIBUTION, CUMULATIVE DISTRIBUTION, EMPIRICAL DISTRIBUTION, LIMITING DISTRIBUTION

Suppose we were able to examine all the items in a population and record a value for each one to obtain a distribution of values. The cumulative distribution function of the population F[x] denotes the probability that an item selected at random from this population will have a value less than or equal to x. 0 ≤ F[x] ≤ 1. Also, if x < y, then F[x] ≤ F[y].

The empirical distribution, usually represented in the form of a cumulative frequency polygon or a bar plot, is the distribution of values observed in a sample taken from a population. If Fn[x] denotes the cumulative distribution of observations in a sample of size n, then as the size of the sample increases we have Fn[x] → F[x].
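A minimal sketch of this convergence follows, assuming for illustration that the population is standard normal so that F[x] has a known form. The function names and the evaluation grid are choices made for the example, not part of the glossary.

```python
import bisect
import math
import random

def empirical_cdf(sample):
    """Return Fn, where Fn(x) is the fraction of observations less than or equal to x."""
    ordered = sorted(sample)
    n = len(ordered)
    def F_n(x):
        # bisect_right counts how many sorted observations are <= x.
        return bisect.bisect_right(ordered, x) / n
    return F_n

def normal_cdf(x):
    """Cumulative distribution function F of the standard normal population."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def max_gap(sample, grid):
    """Largest absolute difference between Fn and F over the grid points."""
    F_n = empirical_cdf(sample)
    return max(abs(F_n(x) - normal_cdf(x)) for x in grid)

rng = random.Random(0)
grid = [i / 10 for i in range(-30, 31)]
for n in (10, 100, 10000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    print(n, round(max_gap(sample, grid), 3))
```

The printed gap should tend to shrink as n grows, although any single small sample can be closer to or farther from F than expected.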
