A blunderbuss approach to criticism of statistics – book review

Terence Hines

Junk Science Judo. By Steven J. Milloy. Cato Institute, 2001. ISBN 1-930865-12-0. 215 pp. Hardcover, $18.95.

Junk Science Judo is an annoying and shallow book that will not provide the reader with anything like a full account of the problems with junk science and the means to combat it. The author discusses some of the usual outrages of the health hysteria mongers, such as claims that cell phones cause brain cancer and that Alar was a carcinogen. But the book is oddly incomplete in this regard. Missing is any coverage of the claims that power lines cause cancer or that breast implants result in immune diseases. These are two of the clearest cases of hysteria-mongering, and their inclusion would have made the book much more compelling by giving crystal-clear examples of why the media-induced fears are unfounded. Oddly, the real target of this book seems to be a poorly defined boogeyman labeled “statistics.” Certainly, sloppy use and interpretation of statistics is one of the major problems at the heart of unfounded health fears. A book that clearly explained the often subtle statistical errors made by promoters of health fears would be a valuable contribution. Unfortunately, this is not that book.

Milloy’s crude broad-brush condemnation of statistics is nicely summed up by his oft-repeated phrase “statistics aren’t science.” This is like saying “tools aren’t carpentry.” True enough, but one is going to get damn little carpentry done without tools, even if tools can (duh!) be abused and used incorrectly. The real problem is that Milloy, judged from his writing, simply doesn’t understand statistical techniques well enough to write cogent criticisms of the poor statistical techniques used to support various health scares. In most cases when he uses the term “statistics” he really means “correlations.” For example, on page 59 he states, “No study that reports only statistical results can prove a cause-and-effect relationship” (emphasis added). This is a simply absurd statement and would never be made by an author who had even a basic knowledge of statistical procedures. There is an entire class of statistical analyses, called inferential statistics, designed precisely to allow researchers to draw inferences from data, including causal inferences when the data come from properly controlled experiments. Milloy’s comment would be true if he substituted “correlational” for “statistical.” It is certainly true that finding a correlation (the term doesn’t even appear in the book’s index) between two variables does not allow one to conclude that there is a causal relationship between them. But Milloy’s ignorance of this fundamental difference between correlations and inferential statistics renders his argument confusing, to say the least.
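The correlation-is-not-causation point above can be made concrete with a small simulation. This is my own illustration, not an example from the book: two variables, here called x and y, that share no causal link at all, yet are strongly correlated because both are driven by a third, confounding variable z.

```python
import random

random.seed(0)

# Hypothetical illustration: z causes both x and y; x and y have
# no causal connection to each other, yet they correlate strongly.
n = 10_000
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 0.3) for zi in z]   # x is caused by z
y = [zi + random.gauss(0, 0.3) for zi in z]   # y is also caused by z

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5

r = pearson(x, y)
print(f"correlation between x and y: {r:.2f}")  # high, despite no x->y causation
```

A correlational study that observed only x and y would find a striking association; only a controlled experiment (manipulating x and watching y) could reveal that the association is entirely due to z.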

At other times, Milloy uses the term “statistics” in slightly different ways. While he never discusses what inferential statistics are, or how they can properly be used to draw conclusions, he does discuss (in chapter 6) the concept of statistical significance and p values. A p value is a probability, ranging from 0 to 1, that gives the likelihood of obtaining a result at least as extreme as the one observed if chance alone were operating. The higher the p value, the more consistent the result is with chance; the smaller the p value, the less plausible chance becomes as an explanation, and the more confidently the result can be attributed to the factors manipulated in the experiment. By general agreement, a p value of .05 or less is accepted as a “significant” result. When discussing p values (p. 108), Milloy makes another absurd statement: “How researchers calculate the p-value is not important.” Going back to the tool and carpentry analogy, this is like saying that it doesn’t matter how a carpenter makes a hole in the wall–a bulldozer is as good as a skill saw. In fact, there are hundreds of different statistical procedures that can be used to calculate a p value. Deciding which one to use on a particular set of data is far from a trivial problem; it is the problem that those of us who teach statistics probably spend more class time on than anything else. If you use the wrong procedure, you’ll get a p value that is simply wrong, and you will often be badly misled as to what your results really mean. I have found (Hines 1998, 2001) that using the wrong procedure to calculate a p value is very common in experiments that claim to support various pseudosciences.

The author’s ignorance of the real nature of, and problems with, statistics time and again prevents him from making his arguments against various health scares as effective as they should be. In chapter 11, “Tricks Are for Kids,” he discusses studies that claim to show that exposure to this or that substance causes cancer. A common procedure in such studies is to examine the relationship between exposure and the rates of numerous different types of cancer (or other ailments). When one, or maybe two, significant relationships are found, the “fact” that substance X “causes” cancer type Y is certified. This is the problem of multiple comparisons. If we accept a p value of .05 as indicating significance, we are also accepting that 5 percent of the time a result will be “significant” by chance alone. So if you examine the effect of exposure to, say, postage stamps on 120 different types of cancer, then even if postage stamps do not cause cancer, about 5 percent of the cancer types examined, six of the 120, will show a “significant” relationship. And, of course, on average three of those relationships will show that postage stamp exposure increases the risk of cancer and three will show that it decreases the risk. The decreased-risk findings just never get any press. This statistical blunder is especially serious in studies claiming to link power lines to various cancers and PCB pollution to developmental disorders (Hines 2002). But Milloy barely mentions the problem.
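The multiple-comparisons arithmetic above is easy to verify with a quick simulation, a sketch of my own using the review’s hypothetical postage-stamp example: run 120 tests of an exposure that has no effect whatsoever, and spurious “significant” results appear anyway.

```python
import random

random.seed(42)

# 120 cancer types tested against a harmless "exposure": no real
# effect exists, so under the null hypothesis each test's p-value
# is uniformly distributed on [0, 1].
n_tests = 120
alpha = 0.05
p_values = [random.random() for _ in range(n_tests)]

false_positives = sum(p < alpha for p in p_values)
expected = n_tests * alpha  # 120 * 0.05 = 6

print(f"expected spurious 'significant' results: {expected:.0f}")
print(f"spurious 'significant' results this run: {false_positives}")
```

Each run turns up a handful of chance “hits,” clustering around the expected six, which is exactly why a study reporting one or two significant exposure-cancer links out of dozens tested proves nothing by itself.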

In another instance, Milloy simply seems not to have read the relevant literature. On page 118 he is properly critical of estimates of the economic costs of cigarette smoking, noting that claims that “differences in medical expenditures between smokers and nonsmokers are due only to smoking” are “probably not true” because of other differences between smokers and nonsmokers. But those huge estimates of the economic “costs” of smoking can be criticized on much more serious grounds. Reports of smoking-related costs are just that–reports of the costs only. They do not take into account the cost savings that result from smoking. Unfortunately, we non-smokers aren’t immortal; we’re going to die of something. The fact is that lung cancer, the major fatal disease of smokers, kills you relatively young and relatively fast, and thus relatively cheaply. So smokers generally die at a time when they have lived a productive life but before they have a chance to develop many of the chronic and debilitating, to say nothing of extremely expensive, diseases of old age such as Alzheimer’s disease. Dead smokers are also less likely to collect Social Security payments and retirement benefits. The point is not that it is “good” that smoking kills people. The point is that any rational economic analysis of the effects of a behavior like smoking must take into account both the real costs and the real cost savings associated with the behavior. For example, a Dutch study (Barendregt et al. 1997) estimated average lifetime health care costs of $83,700 for smokers and $97,200 for non-smokers.

The numerous serious flaws in the logic and coverage of this book render it essentially useless as a guide to the detection of junk science. This is a real shame, as the book does contain interesting and important nuggets of information. I was unaware, for example, that in studies of the risks of secondhand smoke the EPA arbitrarily changed the p value for significance from .05 to .075, thereby shifting the finding from “non-significant” to “significant.” It appears that secondhand smoke isn’t a health risk, though in my view that doesn’t mean smoking shouldn’t be banned. Secondhand smoke is still annoying, and that is sufficient grounds for a ban, just as we would feel no compunction about banning drinkers at a bar from spitting part of their bourbon and water on the folks around them. As the book stands, it comes across as little more than an ill-thought-out temper tantrum against those damn “statistics.”

References

Barendregt, J.J., L. Bonneux, and P.J. van der Maas. 1997. The health care costs of smoking. New England Journal of Medicine 337, 1052-1057.

Hines, T.M. 1998. Comprehensive review of biorhythm theory. Psychological Reports 83, 19-64.

—–. 2001. The Doman-Delacato patterning treatment for brain damage. Scientific Review of Alternative Medicine 5, 80-89.

—–. 2002. Pseudoscience and the Paranormal. 2nd edition. Amherst, NY: Prometheus Books.

Terence Hines is in the psychology department, Pace University, Pleasantville NY 10570-2799. A second edition of his book Pseudoscience and the Paranormal is being published by Prometheus this year.

COPYRIGHT 2002 Committee for the Scientific Investigation of Claims of the Paranormal
