All things being equal
The concept of the null hypothesis is a stumbling block to many statistics students. Perhaps people resist the notion of the null hypothesis because it is the opposite of what they expect. The null hypothesis merely states that nothing special is going on. One shows that something significant is occurring by rejecting the null hypothesis. That is, we demonstrate that something is happening by rejecting the hypothesis that nothing is occurring. Talk about bending over backward!
The null hypothesis deserves more respect than it gets. I believe most of us have a mindset that neglects the contingent nature of existence. This bias toward meaningfulness causes people like Carl Jung to confuse random coincidence with significant synchronicity. Instead of postulating some weird "alignment" of universal forces, why couldn't Jung accept that in a lifetime of occurrences some coincidences would be more startling and remarkable than others? We do, after all, reside in an environment loaded with circumstance. Every day we have thousands of thoughts, meet dozens (hundreds?) of people, see tens of thousands of images, read hundreds (thousands?) of words, and make innumerable choices. Such a combinatorial plethora is all but certain to generate (at thoroughly unpredictable intervals) the occasionally startling coincidence. The startling coincidence will have just "happened" and will have neither significance nor meaning.
Perhaps this point of view is disappointing or even unacceptable to those who prefer to cherish coincidences and endow them with Jungian significance. For those who can't let go of the idea that such occurrences have to mean something, we can ask what happened to the random outcomes. Do they not occur? Are there truly no coincidences? Or is there a moderate position that declares some occurrences are significant and some are not? In this case, how can we tell the difference? And if we can't, how meaningful can the difference be?
The null hypothesis is more than a simple statement that nothing is going on. Its role in statistics is to provide a neutral baseline from which alternative hypotheses are evaluated. For example, suppose you are responsible for testing the claim that a new drug is efficacious. Naturally, the null hypothesis would be that the new drug makes no difference. You conduct a series of drug trials and find that those patients who received the drug did slightly better than those who received a placebo. How do you decide that the improvement was large enough to be meaningful? You return to the null hypothesis, the claim that there is no effect, and calculate the probability that an improvement at least that large could have occurred purely by chance. If you find that such an improvement would occur less than 5% of the time by mere chance, you would be justified in saying that the drug is better than the placebo. (Statisticians refer to the 5% threshold as the level of significance. The choice of level of significance is a judgment call, although 5% and 1% are traditionally the most popular.)
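The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patient outcomes are made up, the function name permutation_p_value is my own, and a permutation test is just one of several ways a statistician might estimate the probability in question. The idea is to shuffle the "drug" and "placebo" labels many times, which is what the world would look like if the drug made no difference, and count how often chance alone produces a gap at least as large as the observed one.

```python
import random

def permutation_p_value(treated, placebo, trials=10_000, seed=0):
    """Estimate how often chance alone yields a gap at least as large as the observed one."""
    rng = random.Random(seed)
    observed = sum(treated) / len(treated) - sum(placebo) / len(placebo)
    pooled = list(treated) + list(placebo)
    n = len(treated)
    hits = 0
    for _ in range(trials):
        # Under the null hypothesis the labels are arbitrary, so reshuffle them.
        rng.shuffle(pooled)
        gap = sum(pooled[:n]) / n - sum(pooled[n:]) / (len(pooled) - n)
        if gap >= observed:
            hits += 1
    return hits / trials

# Hypothetical outcomes: 1 = patient improved, 0 = no improvement.
treated = [1] * 30 + [0] * 20   # 60% improved on the drug
placebo = [1] * 24 + [0] * 26   # 48% improved on the placebo

p = permutation_p_value(treated, placebo)
print(f"estimated p-value: {p:.3f}")
print("reject the null hypothesis" if p < 0.05 else "do not reject the null hypothesis")
```

For these invented numbers the estimate lands well above 0.05, so a twelve-point gap between two groups of fifty patients is not, by itself, enough to reject the null hypothesis.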
The null hypothesis is not a belligerent option. It is a touchstone or standard against which rival claims are gauged. If an alternative does not show itself to be sufficiently remarkable relative to the null hypothesis, then the alternative is not deemed worthy of provisional acceptance. That is, we do not reject the null hypothesis (nothing is happening) for the alternative hypothesis (something is going on) if our experimental results could easily occur under the null hypothesis. In terms of our drug-testing example, why would you accept the purported efficacy of the drug if the observed improvement could have occurred 40% of the time by mere chance? Sure, 40% is less than even odds, but it is still much too high to warrant much faith in the treatment. Statisticians routinely set the significance bar at 5% or even 1% to ensure that we do not reject the null hypothesis too casually.
A pox on all houses
Unfortunately for statisticians, the public most often encounters measures of significance in political polls, surveys whose results are controversial and contentious by their very nature. Poll results are usually stated with the caveat that they have been computed with a 95% confidence level. In other words, there is only a 5% probability (there's that 5% again) that the results are wrong by more than a specified amount (the specified amount, the margin of error, is usually given as plus-or-minus a certain number of percentage points, the number of points determined by the poll's sample size). Stated another way, if the poll were repeated multiple times, the results would miss by more than that margin one time out of twenty (on the average). Given that political polls are conducted frequently during the most heated contests, we run into the unhappy situation where bitter accusations of bias or incompetence (from whichever candidate is trailing in the polls) are aligned with just enough divergent results (every twentieth, on the average) to cause people to throw up their hands and declare that pols, polls, and pollsters are all reprehensible. (I have not even raised the point that some polls are indeed conducted by hirelings who skew the outcomes to favor the candidate who hired them. The point that politicians may be scoundrels hardly needs to be made, as examples are conveniently numerous these days.)
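To make the link between sample size and the plus-or-minus figure concrete, here is a minimal sketch of the textbook margin-of-error formula. It assumes a simple random sample and the usual normal approximation, with 1.96 as the multiplier for a 95% confidence level; real pollsters apply weighting and other adjustments that this ignores, and the function name margin_of_error is my own.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    # proportion=0.5 is the worst case, which is what polls conventionally report.
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for n in (400, 1000, 2500):
    print(f"sample of {n}: plus-or-minus {100 * margin_of_error(n):.1f} points")
```

Under these assumptions a sample of about 1,000 respondents gives a margin of roughly three percentage points, and quadrupling the sample only halves the margin.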
The other unfortunate factor in political polling, quite apart from biased pollsters and acrimonious debates concerning even responsible polling, is that polls are by their nature no more than snapshots. A poll that finds 45% of the voters in favor of Candidate A in October is not suddenly invalidated if Candidate A receives 51% of the vote in November. For all we know (and there are statistical measures to help us gauge how much we can reasonably know), exactly 45% of the voters were in favor of Candidate A in October. The candidate simply picked up another six percentage points of support between the poll and the election. Still, the consequence of such contrasts is that polls are routinely regarded as having been proved wrong after the fact. That is really too bad, because responsibly conducted polls (in most cases, polls not sponsored by a particular candidate or cause) provide useful information on the opinions of the electorate. As I said, they are snapshots, not predictions.
Speaking of predictions, there is another venue in which the poor null hypothesis is routinely treated with abuse. The entire field of psychic research is particularly unfriendly toward the null hypothesis that nothing is going on. Psychic researchers have been reduced in recent decades to searching through their data for subtle signs that something might be happening, attempting to tease out some shred of significance in anything slightly out of the ordinary. This is a good point at which to recall that a test run at the 5% level of significance will incorrectly reject a true null hypothesis about 5% of the time anyway, just by chance. Of such Type I errors entire careers have been constructed. The diligent psychic researcher, however, will find that the false positives will eventually settle down at the unmeaningful 5% level as he or she continues to investigate. A notable example is Dr. Susan Blackmore, who eventually abandoned research in parapsychology for the more fertile field of consciousness. (See in particular her short essay on giving up parapsychology.)
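The 5% false-positive rate is easy to see in a simulation. The sketch below is an illustration only: it invents thousands of experiments in which nothing whatsoever is going on (every score is pure noise with a true mean of zero), applies a simple two-sided z-test at the 5% level to each one, and counts how many come out "significant" anyway. The function name, the number of experiments, and the sample size are all arbitrary choices of mine.

```python
import math
import random

def false_positive_rate(experiments=4000, sample_size=25, z_cutoff=1.96, seed=1):
    """Fraction of null experiments that a 5%-level z-test declares significant."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(experiments):
        # Pure noise: the null hypothesis is true by construction.
        scores = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        mean = sum(scores) / sample_size
        # The standard deviation is known to be 1, so z = mean * sqrt(n).
        z = mean * math.sqrt(sample_size)
        if abs(z) > z_cutoff:
            rejections += 1
    return rejections / experiments

rate = false_positive_rate()
print(f"share of null experiments declared significant: {rate:.3f}")
```

The share hovers near 0.05 no matter how many experiments are run, which is the Type I error rate doing exactly what the level of significance says it will.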
The most parsimonious explanation for the long-term and continuing failure of parapsychological research is that its practitioners are searching for something that is simply not there. Rare examples like Susan Blackmore notwithstanding, I confidently predict that psychic research is here to stay. Its devotees are too emotionally invested in the idea that coincidences, lucky guesses, and intuition are deeply significant representatives of profound and underdeveloped human powers. The null hypothesis is a more satisfactory explanation because it is simple and sensible. What it lacks, however, is allure and mystery, so the null hypothesis will continue to be rejected by those whom it fails to satisfy.
I am confident in my prediction, but I will not claim that I am clairvoyant if it comes true.