# Common Errors in Statistics and How to Avoid Them - Good P.I


What Is a Hypothesis?

A well-formulated hypothesis will be both quantifiable and testable—that is, involve measurable quantities or refer to items that may be assigned to mutually exclusive categories.

A well-formulated statistical hypothesis takes one of the following forms: “Some measurable characteristic of a population takes one of a specific set of values,” or “Some measurable characteristic takes different values in different populations, the difference(s) taking a specific pattern or a specific set of values.”

Examples of well-formed statistical hypotheses include the following:

• “For males over 40 suffering from chronic hypertension, a 100 mg daily dose of this new drug lowers diastolic blood pressure an average of 10 mm Hg.”

• “For males over 40 suffering from chronic hypertension, a daily dose of 100 mg of this new drug lowers diastolic blood pressure an average of 10 mm Hg more than an equivalent dose of metoprolol.”

• “Given less than 2 hours per day of sunlight, applying from 1 to 10 lb of 23-2-4 fertilizer per 1000 square feet will have no effect on the growth of fescues and Bermuda grasses.”

“All redheads are passionate” is not a well-formed statistical hypothesis—not merely because “passionate” is ill-defined, but because the word “All” indicates that the phenomenon is not statistical in nature.

Similarly, logical assertions of the form “Not all,” “None,” or “Some” are not statistical in nature. The restatement, “80% of redheads are passionate,” would remove this latter objection.

The restatements, “Doris J. is passionate,” or “Both Good brothers are 5'10" tall,” also are not statistical in nature because they concern specific individuals rather than populations (Hagood, 1941).

If we quantify “passionate” to mean “has an orgasm more than 95% of the time consensual sex is performed,” then the hypothesis “80% of redheads are passionate” becomes testable. Note that defining “passionate” to mean “has an orgasm every time consensual sex is performed” would not be provable as it is a statement of the “all or none” variety.

Finally, note that until someone succeeds in locating unicorns, the hypothesis “80% of unicorns are passionate” is not testable.

Formulate your hypotheses so they are quantifiable, testable, and statistical in nature.
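To make “testable” concrete, here is a minimal sketch of how a hypothesis such as “80% of redheads are passionate” could be checked against a sample. It uses an exact two-sided binomial test written with the standard library only. The function names and the sample counts (50 subjects, 41 or 30 meeting the definition) are hypothetical and chosen purely for illustration; they do not come from the text.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_p_value(successes, n, p0):
    """Two-sided exact p-value under H0: success probability == p0.

    Sums the probability of every outcome that is no more likely
    than the observed one (a common convention; others exist).
    """
    observed = binom_pmf(successes, n, p0)
    tolerance = 1e-12  # guards against floating-point ties
    total = sum(
        binom_pmf(k, n, p0)
        for k in range(n + 1)
        if binom_pmf(k, n, p0) <= observed * (1 + tolerance)
    )
    return min(1.0, total)


# Hypothetical samples of 50 subjects each (illustrative numbers only).
p_close = exact_binomial_p_value(41, 50, 0.80)  # 82% observed
p_far = exact_binomial_p_value(30, 50, 0.80)    # 60% observed

print(f"41 of 50: p = {p_close:.3f}")
print(f"30 of 50: p = {p_far:.4f}")
```

A sample proportion near 80% gives a large p-value, so the data are compatible with the hypothesis; a proportion far from 80% gives a small one. Neither outcome is possible for an “all or none” statement, which a single counterexample refutes and no finite sample can confirm.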

How Precise Must a Hypothesis Be?

The chief executive of a drug company may well express a desire to test whether “our anti-hypertensive drug can beat the competition.” But to apply statistical methods, a researcher will need precision on the order of “For males over 40 suffering from chronic hypertension, a daily dose of 100 mg of our new drug will lower diastolic blood pressure an average of 10 mm Hg more than an equivalent dose of metoprolol.”
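The value of this precision is that the hypothesis now names a number (10 mm Hg) that a test can be built around. The sketch below illustrates the idea with a large-sample z-test on the difference between two mean reductions. The summary statistics are invented for illustration, the function name is hypothetical, and the normal approximation is an assumption; an actual trial would follow the analysis specified in its protocol.

```python
from math import sqrt
from statistics import NormalDist


def z_test_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b, hypothesized_diff):
    """Large-sample two-sided z-test of H0: mean_a - mean_b == hypothesized_diff."""
    standard_error = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    z = ((mean_a - mean_b) - hypothesized_diff) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical mean reductions in diastolic blood pressure (mm Hg).
z, p = z_test_difference(
    mean_a=21.0, sd_a=8.0, n_a=200,   # new drug (invented figures)
    mean_b=12.0, sd_b=8.0, n_b=200,   # metoprolol (invented figures)
    hypothesized_diff=10.0,           # the 10 mm Hg stated in the hypothesis
)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With “can beat the competition” there is no `hypothesized_diff` to supply, and so no test statistic to compute.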

The researcher may want to test a preliminary hypothesis on the order of “For males over 40 suffering from chronic hypertension, there is a daily dose of our new drug which will lower diastolic blood pressure an average of 20 mm Hg.” But this hypothesis is imprecise. What if the necessary dose of the new drug required taking a tablet every hour? Or caused liver malfunction? Or even death? First, the researcher would conduct a set of clinical trials to determine the maximum tolerable dose (MTD) and then test the hypothesis, “For males over 40 suffering from chronic hypertension, a daily dose of one-third to one-fourth the MTD of our new drug will lower diastolic blood pressure an average of 20 mm Hg.”

A BILL OF RIGHTS

• Scientists can and should be encouraged to make subgroup analyses.

• Physicians and engineers should be encouraged to make decisions utilizing the findings of such analyses.

• Statisticians and other data analysts can and should rightly refuse to give their imprimatur to related tests of significance.

In a series of articles by Horwitz et al. [1998], a physician and his colleagues strongly criticize the statistical community for denying them (or so they perceive) the right to provide a statistical analysis for subgroups not contemplated in the original study protocol. For example, suppose that in a study of the health of Marine recruits, we notice that not one of the dozen or so women who received the vaccine contracted pneumonia. Are we free to provide a p value for this result?

Statisticians Smith and Egger [1998] argue against hypothesis tests of subgroups chosen after the fact, suggesting that the results are often likely to be explained by the “play of chance.” Altman [1998b, pp. 301-303], another statistician, concurs, noting that “. . . the observed treatment effect is expected to vary across subgroups of the data . . . simply through chance variation” and that “doctors seem able to find a biologically plausible explanation for any finding.” This leads Horwitz et al. [1998] to the incorrect conclusion that Altman proposes we “dispense with clinical biology (biologic evidence and pathophysiologic reasoning) as a basis for forming subgroups.” Neither Altman nor any other statistician would quarrel with Horwitz et al.’s assertion that physicians must investigate “how do we [physicians] do our best for a particular patient.”
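The “play of chance” can be made tangible with a small calculation. The sketch below assumes a study in which the treatment has no effect in any subgroup and in which the subgroup tests are independent; both are simplifying assumptions, and the figure of 20 subgroups is arbitrary. Under those assumptions it computes, and checks by simulation, how often at least one subgroup would appear “significant” at the 5% level.

```python
import random


def chance_of_false_positive(num_subgroups, alpha=0.05):
    """Probability that at least one of several independent null tests rejects."""
    return 1 - (1 - alpha) ** num_subgroups


def simulate_false_positive_rate(num_subgroups, alpha=0.05,
                                 num_studies=20000, seed=0):
    """Monte Carlo check of the formula above.

    With no true effect and a continuous test statistic, each subgroup's
    p-value is uniform on [0, 1], so it is drawn directly.
    """
    rng = random.Random(seed)
    studies_with_a_hit = 0
    for _ in range(num_studies):
        if any(rng.random() < alpha for _ in range(num_subgroups)):
            studies_with_a_hit += 1
    return studies_with_a_hit / num_studies


analytic = chance_of_false_positive(20)
simulated = simulate_false_positive_rate(20)
print(f"analytic:  {analytic:.3f}")
print(f"simulated: {simulated:.3f}")
```

Under these assumptions, a search across 20 after-the-fact subgroups turns up at least one nominally significant result in roughly two studies out of three, even though nothing real is there to find. Exploring subgroups is still useful for generating hypotheses; the difficulty lies in attaching a p value to whichever subgroup happened to stand out.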
