# Common Errors in Statistics and How to Avoid Them - Good P.I


For example, for testing against K0, Lehmann [1999, p. 372] recommends the use of the Jonckheere-Terpstra statistic, the number of pairs in which an observation from one group is less than an observation from a higher-dose group. The penalty we pay for using this statistic and ignoring the actual values of the observations is a marked reduction in power for small samples and a less pronounced loss for larger ones.

64 PART II HYPOTHESIS TESTING AND ESTIMATION

If there are just two samples, the test based on the Jonckheere-Terpstra statistic is identical to the Mann-Whitney test. For very large samples, with identically distributed observations in both samples, 100 observations would be needed with this test to obtain the same power as a permutation test based on the original values of 95 observations. This is not a price one would want to pay in human or animal experiments.
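The pair-counting definition of the Jonckheere-Terpstra statistic can be sketched in a few lines. This is a minimal illustration, not Lehmann's formulation: the function name `jonckheere_terpstra`, the sample data, and the convention of counting a tie as one half are all assumptions made for the example.

```python
from itertools import combinations

def jonckheere_terpstra(groups):
    """Count pairs (x, y) with x from a lower-dose group, y from a
    higher-dose group, and x < y. Groups must be ordered by dose.
    Ties count one half (an assumed convention)."""
    stat = 0.0
    for low, high in combinations(groups, 2):
        for x in low:
            for y in high:
                if x < y:
                    stat += 1.0
                elif x == y:
                    stat += 0.5
    return stat

# Hypothetical responses at three increasing dose levels.
doses = [[1.2, 2.5, 3.1], [2.8, 3.6, 4.0], [3.9, 4.4, 5.2]]
jt = jonckheere_terpstra(doses)

# With only two groups, the same count is the Mann-Whitney U statistic.
u = jonckheere_terpstra(doses[:2])
print(jt, u)
```

Because only the ordering of each pair enters the count, replacing the observations by any order-preserving transformation leaves the statistic unchanged, which is exactly the information about actual values that the test gives up.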

HIGHER-ORDER EXPERIMENTAL DESIGNS

Similar caveats hold for the parametric ANOVA approach to the analysis of a two-factor experimental design, with two additions:

1. The sample sizes must be the same in each cell; that is, the design must be balanced.

2. A test for interaction must precede any test for main effects.

Imbalance in the design will result in the confounding of main effects with interactions. Consider the following two-factor model for crop yield:

$$X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$$

Now suppose that the observations in a two-factor experimental design are normally distributed as in the following diagram taken from Cornfield and Tukey (1956):

N(0,1)| N(2,1)

N(2,1)| N(0,1)

There are no main effects in this example—both row means and both column means have the same expectations, but there is a clear interaction represented by the two nonzero off-diagonal elements.

If the design is balanced, with equal numbers per cell, the lack of significant main effects and the presence of a significant interaction should and will be confirmed by our analysis. But suppose that the design is not in balance, that for every 10 observations in the first column, we have only one observation in the second. Because of this imbalance, when we use the F ratio or equivalent statistic to test for the main effect, we will uncover a false “row” effect that is actually due to the interaction between rows and columns. The main effect is confounded with the interaction.
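The arithmetic behind the false row effect can be checked directly from the expected cell means. The sketch below is only an illustration: the helper `row_means` and the particular cell counts are assumptions, chosen to match the 10-to-1 imbalance described in the text.

```python
# Expected cell means from the Cornfield and Tukey diagram.
cell_means = [[0.0, 2.0],
              [2.0, 0.0]]

def row_means(means, counts):
    """Expected mean of each row when cells are weighted by their counts."""
    return [sum(m * n for m, n in zip(mrow, nrow)) / sum(nrow)
            for mrow, nrow in zip(means, counts)]

balanced = [[5, 5], [5, 5]]
unbalanced = [[10, 1], [10, 1]]  # ten observations in column 1 per one in column 2

bal = row_means(cell_means, balanced)      # both rows average to 1
unbal = row_means(cell_means, unbalanced)  # 2/11 for row 1, 20/11 for row 2
gap = unbal[1] - unbal[0]                  # 18/11, although no row effect exists
print(bal, unbal, gap)
```

With equal cell counts the two row means agree, so nothing masquerades as a row effect. With the 10-to-1 weighting, each row mean is dominated by its first-column cell, and the difference between those cells (which is pure interaction) shows up as a difference between rows.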

If a design is unbalanced as in the preceding example, we cannot test for a “pure” main effect or a “pure” interaction. But we may be able to test for the combination of a main effect with an interaction by using the statistic that we would use to test for the main effect alone. This combined effect will not be confounded with the main effects of other unrelated factors.

CHAPTER 5 TESTING HYPOTHESES: CHOOSING A TEST STATISTIC 65

Whether or not the design is balanced, the presence of an interaction may zero out a cofactor-specific main effect or make such an effect impossible to detect. More important, the presence of a significant interaction may render the concept of a single “main effect” meaningless. For example, suppose we decide to test the effect of fertilizer and sunlight on plant growth. With too little sunlight, a fertilizer would be completely ineffective. Its effects only appear when sufficient sunlight is present. Aspirin and warfarin can both reduce the likelihood of repeated heart attacks when used alone; you don’t want to mix them!

Gunter Hartel offers the following example: Using five observations per cell and random normals as indicated in Cornfield and Tukey’s diagram, a two-way ANOVA without interaction yields the following results:

| Source | df | Sum of Squares | F Ratio | Prob > F |
|---|---|---|---|---|
| Row | 1 | 0.15590273 | 0.0594 | 0.8104 |
| Col | 1 | 0.10862944 | 0.0414 | 0.8412 |
| Error | 17 | 44.639303 | | |

Adding the interaction term yields

| Source | df | Sum of Squares | F Ratio | Prob > F |
|---|---|---|---|---|
| Row | 1 | 0.155903 | 0.1012 | 0.7545 |
| Col | 1 | 0.108629 | 0.0705 | 0.7940 |
| Row*Col | 1 | 19.986020 | 12.9709 | 0.0024 |
| Error | 16 | 24.653283 | | |
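The balanced case can be sketched with the standard library alone. This is not a reproduction of Hartel's run: the seed is arbitrary, so the numbers will differ from the tables, and the variable names are assumptions. What it illustrates is the structural point visible in the two balanced tables: the row and column sums of squares are the same in both models, and dropping the interaction term simply moves its sum of squares into the error line.

```python
import random

random.seed(0)  # arbitrary seed, chosen only to make the sketch repeatable
n = 5           # observations per cell, as in the example
mu = [[0.0, 2.0], [2.0, 0.0]]  # cell means from the diagram
data = [[[random.gauss(mu[i][j], 1.0) for _ in range(n)]
         for j in range(2)] for i in range(2)]

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

grand = mean(x for row in data for cell in row for x in cell)
row_m = [mean(x for cell in data[i] for x in cell) for i in range(2)]
col_m = [mean(x for i in range(2) for x in data[i][j]) for j in range(2)]
cell_m = [[mean(data[i][j]) for j in range(2)] for i in range(2)]

# Sums of squares for a balanced 2x2 design (each row or column holds 2*n values).
ss_row = 2 * n * sum((m - grand) ** 2 for m in row_m)
ss_col = 2 * n * sum((m - grand) ** 2 for m in col_m)
ss_int = n * sum((cell_m[i][j] - row_m[i] - col_m[j] + grand) ** 2
                 for i in range(2) for j in range(2))
ss_err = sum((x - cell_m[i][j]) ** 2
             for i in range(2) for j in range(2) for x in data[i][j])
ss_total = sum((x - grand) ** 2 for row in data for cell in row for x in cell)

# Omitting the interaction from the model folds it into the error term.
ss_err_additive = ss_total - ss_row - ss_col

df_err = 4 * (n - 1)                # 16 error df with the interaction fitted
f_int = ss_int / (ss_err / df_err)  # the interaction has 1 df
print(ss_row, ss_col, ss_int, ss_err, f_int)
```

This clean additive split of the total sum of squares relies on equal cell counts; with unequal counts the row, column, and interaction components are no longer orthogonal, and the formulas above do not apply as written.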

When the first row of the experiment is expanded to have 80 observations rather than 10, the main-effects-only table becomes

| Source | df | Sum of Squares | F Ratio | Prob > F |
|---|---|---|---|---|
| Row | 1 | 0.080246 | 0.0510 | 0.8218 |
| Col | 1 | 57.028458 | 36.2522 | <.0001 |
| Error | 88 | 138.43327 | | |

But with the interaction term it is:

| Source | df | Sum of Squares | F Ratio | Prob > F |
|---|---|---|---|---|
| Row | 1 | 0.075881 | 0.0627 | 0.8029 |
| Col | 1 | 0.053909 | 0.0445 | 0.8333 |
| Row*Col | 1 | 33.145790 | 27.3887 | <.0001 |
| Error | 87 | 105.28747 | | |


Independent Tests

Normally distributed random variables (as in Figure 7.1) have some remarkable properties:

• The sum (or difference) of two independent normally distributed random variables is a normally distributed random variable.

• The square of a normally distributed random variable with mean zero has the chi-square distribution (to within a multiplicative constant); the sum of two independent variables with the chi-square distribution also has a chi-square distribution (with additional degrees of freedom).

• A variable with the chi-square distribution can be decomposed into the sum of several independent chi-square variables.
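In symbols, the first two properties can be written as follows. The notation ($X$, $Y$, $Z$, $U$, $V$, the means, variances, and degrees of freedom $m$ and $n$) is introduced here for illustration and is not the book's; the second line assumes a mean of zero, and the third assumes independence.

```latex
\begin{align*}
X \sim N(\mu_1, \sigma_1^2),\; Y \sim N(\mu_2, \sigma_2^2) \text{ independent}
  &\;\Longrightarrow\; X \pm Y \sim N(\mu_1 \pm \mu_2,\; \sigma_1^2 + \sigma_2^2) \\
Z \sim N(0, \sigma^2)
  &\;\Longrightarrow\; Z^2 / \sigma^2 \sim \chi^2_1 \\
U \sim \chi^2_m,\; V \sim \chi^2_n \text{ independent}
  &\;\Longrightarrow\; U + V \sim \chi^2_{m+n}
\end{align*}
```

Note that the variances add for both the sum and the difference, and that the divisor $\sigma^2$ in the second line is the multiplicative constant mentioned in the text.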
