Most people would be surprised by the statement "science doesn't prove anything." You could misinterpret it as the remark of a closed-minded person who still believes the world is flat. To understand what I mean, however, you need to understand something about the methods involved in conducting research; broadly speaking, we call these methods the scientific method. It's also important to understand these issues when you read about or hear reports on the results of research, because it will help you interpret the results for yourself.

The scientific method is difficult to concisely define, but the following three steps are generally involved:

- review the facts, theories and proposals on the subject,
- formulate a logical hypothesis that can be tested using experimental methods, and
- conduct an objective evaluation of the hypothesis on the basis of the experimental results.

We will cover the last two steps in this article.

Many times, an objective evaluation of a hypothesis is difficult because exacting laws of cause and effect are generally unknown for the biological systems we study. Other difficulties arise from natural variation in the environment and organisms (this was the subject of a previous Ag News and Views article). Finally, it's not possible to observe every conceivable event because of limited time and resources. For example, we might initiate some field trials to study the forage quality and yield of ten different bermudagrass varieties. It would be reasonable to initiate field trials at a couple of our farms, say the Red River Farm in Love County and the Pasture Demonstration Farm in Carter County. The total land area occupied by these two field trials would likely be less than two acres. From this, you can see that scientific study of a population is generally restricted to a very limited subset of observations. In this example, less than two acres will be used within the much larger "population" of all the agricultural fields in Carter and Love counties that might grow bermudagrass for forage.

Statistical methods are the best tools to objectively test hypotheses. The statistical procedure for testing first requires that we concisely define the hypothesis. In a previous article, I used an example of designing an experiment to compare the forage yield of a new rye variety to an old standard variety. In this example, the hypothesis we test would be that the forage yield of the new variety is equal to that of the old variety. Statistically, this is referred to as the null hypothesis, because it is a statement of no difference. If we conclude that the null hypothesis is false, then an alternative hypothesis will be assumed to be true. In this case, the alternative hypothesis would be that the forage yields of the two varieties are not equal. For every statistical test performed, null and alternative hypotheses are stated so that all possible outcomes are accounted for between them, but technically, only the null hypothesis is tested.
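To make the null hypothesis idea concrete, here is one simple way such a test can be carried out: a permutation test, sketched in Python. This is not from the article; the plot yields are hypothetical numbers invented for illustration. The logic is that if the null hypothesis (equal mean yields) were true, the variety labels would be interchangeable, so we compare the observed yield difference to the differences produced by random relabeling.

```python
import random
import statistics

def permutation_test(a, b, n_perm=5000, seed=1):
    """Two-sample permutation test of the null hypothesis that the
    two groups have equal mean yield. Returns an approximate p-value:
    the fraction of random relabelings whose mean difference is at
    least as large as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)           # pretend the labels are interchangeable
        perm_a = pooled[:len(a)]
        perm_b = pooled[len(a):]
        if abs(statistics.mean(perm_a) - statistics.mean(perm_b)) >= observed:
            count += 1
    return count / n_perm

# Hypothetical plot yields (tons/acre) for a new and an old rye variety
new = [3.1, 3.4, 3.3, 3.6, 3.2, 3.5]
old = [2.8, 3.0, 2.9, 3.1, 2.7, 3.0]

p = permutation_test(new, old)
# A small p (below the chosen significance level, say 0.05) leads us
# to reject the null hypothesis of equal yields.
```

Note that even here we only reject or fail to reject the null hypothesis; the test never "proves" the new variety is better.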

One needs an objective criterion for rejecting or not rejecting the null hypothesis. In general, we first compute the appropriate test statistic from the experimental measurements and compare it to a critical value for that statistic. If the calculated value is larger than the critical value, we reject the null hypothesis. The magnitude of the critical value is determined by the number of observations (n) in the experiment and by a predetermined probability of rejecting a true null hypothesis. This probability is called the significance level.
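The recipe above can be sketched in a few lines of Python. This assumes a pooled two-sample t test on the hypothetical rye yields from the earlier example; the critical value 2.228 is the tabled two-tailed 5 percent value for a t statistic with 10 degrees of freedom.

```python
import math
import statistics

def pooled_t(a, b):
    """Student's t statistic for two independent samples (pooled variance)."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * statistics.variance(a)
           + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

# Hypothetical plot yields (tons/acre) for a new and an old rye variety
new = [3.1, 3.4, 3.3, 3.6, 3.2, 3.5]
old = [2.8, 3.0, 2.9, 3.1, 2.7, 3.0]

t = pooled_t(new, old)
T_CRIT = 2.228  # tabled two-tailed 5% critical value for t with 6 + 6 - 2 = 10 df
reject = abs(t) > T_CRIT  # calculated statistic exceeds the critical value: reject
```

With more observations the degrees of freedom rise and the tabled critical value shrinks, which is one way the sample size n enters the decision.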

It's very important to realize that statistical testing is based on probabilities, which means errors can occur. For instance, a true null hypothesis will occasionally be tested and determined to be false, which of course means we have committed an error. For this type of error, if our significance level is 5 percent, we will expect it to occur 5 percent of the time, on average. We are willing to accept this because we have some idea of the probability of it happening, at least in theory, but we never really know when it actually happens! You might be thinking, "Why not lower the significance level to reduce the chance of committing this type of error (incorrectly rejecting a true null hypothesis)?" Good thinking; unfortunately, there is a second type of error that we must consider.
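This "5 percent of the time, on average" claim can be checked by simulation. The sketch below (invented means and standard deviations, assuming normally distributed yields) repeats an experiment many times with the null hypothesis deliberately made true, and counts how often the test wrongly rejects it.

```python
import math
import random
import statistics

def pooled_t(a, b):
    """Student's t statistic for two independent samples (pooled variance)."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * statistics.variance(a)
           + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

rng = random.Random(42)
T_CRIT = 2.101  # tabled two-tailed 5% critical value for t with 18 df (10 + 10 - 2)
trials = 2000
rejections = 0
for _ in range(trials):
    # Both samples come from the SAME distribution, so the null hypothesis is true
    a = [rng.gauss(3.0, 0.3) for _ in range(10)]
    b = [rng.gauss(3.0, 0.3) for _ in range(10)]
    if abs(pooled_t(a, b)) > T_CRIT:
        rejections += 1  # a false rejection: the first type of error

rate = rejections / trials  # long-run error rate, close to 0.05
```

The long-run rate hovers near 0.05, but, just as the article says, in any single experiment we never know whether that particular rejection was one of the errors.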

The second type of error we can make is not rejecting, or accepting, the null hypothesis when it is in fact false. This type of error has a different probability of occurrence, compared to the first type, but it's more difficult to quantify. In fact, we generally neither specify nor know how often it occurs. What we do know is that for a given sample size, n, the probability of committing the first type of error (rejecting a true null hypothesis) is inversely related to the probability of committing this second type of error (accepting a false null hypothesis). That is, lower probabilities of committing the first type of error are associated with higher probabilities of committing the second type of error, and the only way to reduce both types of error simultaneously is to increase the number of observations, or the number of replications, in the experiment.
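Both relationships described above can be illustrated by a small simulation. This is a sketch with invented parameters: a true yield advantage of 0.25 tons/acre, a plot-to-plot standard deviation of 0.3, and tabled t critical values for the relevant degrees of freedom. It estimates how often a real difference goes undetected, for different replication counts and significance levels.

```python
import math
import random
import statistics

def pooled_t(a, b):
    """Student's t statistic for two independent samples (pooled variance)."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * statistics.variance(a)
           + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

def miss_rate(n, t_crit, diff=0.25, sd=0.3, trials=1000, seed=7):
    """Fraction of simulated experiments in which a real yield difference
    of `diff` goes undetected, i.e., the false null hypothesis is not
    rejected (the second type of error)."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        a = [rng.gauss(3.0 + diff, sd) for _ in range(n)]  # new variety truly better
        b = [rng.gauss(3.0, sd) for _ in range(n)]
        if abs(pooled_t(a, b)) <= t_crit:
            misses += 1
    return misses / trials

beta_small = miss_rate(5, 2.306)   # 5 reps per variety; 5% critical value, 8 df
beta_large = miss_rate(20, 2.024)  # 20 reps per variety; 5% critical value, 38 df
beta_strict = miss_rate(5, 3.355)  # 5 reps, but a stricter 1% critical value, 8 df
# More replications shrink the second type of error (beta_large < beta_small),
# while lowering the significance level inflates it (beta_strict > beta_small).
```

In other words, tightening up against one kind of mistake makes the other kind more likely, and only more replication buys down both at once.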

To sum up, be leery of results from any "experiment" where there is no replication of treatments and/or the number of observations is very small. Consider carefully the circumstances and environmental conditions from which the results were obtained. Applying results to widely differing conditions is never advisable. This is all quite deep, but I hope you can see that the scientific method only allows us to disprove hypotheses, not prove them. In this sense, you can see that science really doesn't prove anything!