jroese@lssu.edu
TESTS OF HYPOTHESES
Tests of hypotheses are a formalized method of stating a question in an unambiguous way. In a properly constructed
set of hypotheses, every possible outcome ('equal', 'greater than', and 'less than'; the latter two are often combined as
'not equal') is accounted for. Typically the outcome of interest is referred to as the research or alternative hypothesis and
is symbolized as HA. The remaining outcomes (always including the equality) constitute the null hypothesis, denoted
H0. As always in science, when we perform a test of hypotheses, we never prove anything; rather, after conducting
the test, we either 'reject the null hypothesis' or 'fail to reject the null hypothesis'.
Making the Wrong Conclusion
It is essential to remember that statistical analysis is based on probabilities. Unfortunately, this always includes some
probability of being wrong. When you base a conclusion on the result of a test of hypotheses, you need to know the
probability of making an incorrect decision. There are two ways in which this could happen: 1) you could reject the
null hypothesis when it is actually true (rejection error, Type I error, α-error), or 2) you could fail to reject the null
hypothesis when it is actually false (acceptance error, Type II error, β-error). Traditionally, we are most concerned
with the Type I error, and decisions of statistical significance are typically based on controlling the probability of
making this type of error.
Alpha (α) Levels and p-values
As scientists, we must learn to accept the possibility that we may make incorrect decisions based on our analysis of
sample data. We would, however, like to keep the probability of being wrong as small as possible. One way in which
this is accomplished is to predetermine the level of risk (i.e. probability of being wrong) we are willing to accept. As
mentioned above, the error we wish to minimize is the possibility of rejecting a null hypothesis that is actually true
(α-error). For this reason, the level of risk we define as acceptable is referred to as the alpha level. A typical value for
alpha is 0.05. In other words, we are willing to take a 5% chance of making a Type I error. (It should be noted that
studies dealing with drugs or other health related topics often set alpha to 0.01 or even 0.001).
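The meaning of the alpha level can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions that are not part of the discussion above: the data are normally distributed, the population standard deviation is known (so a z-test applies), and the function names, sample size, and number of trials are hypothetical choices. Every simulated sample is drawn from a population in which the null hypothesis is true, so every rejection is a Type I error.

```python
import random
from statistics import NormalDist

def rejects_null(sample, mu0, sigma, alpha):
    """Two-sided z-test of H0: mu = mu0 with known sigma. True means 'reject H0'."""
    n = len(sample)
    sample_mean = sum(sample) / n
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    # Cutoff chosen so that a true H0 is rejected with probability alpha.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

def type_i_error_rate(alpha, trials=20000, n=25, seed=1):
    """Fraction of simulated experiments that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The true mean is 0, so H0: mu = 0 is actually true in every trial.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if rejects_null(sample, mu0=0.0, sigma=1.0, alpha=alpha):
            rejections += 1
    return rejections / trials

for alpha in (0.05, 0.01):
    rate = type_i_error_rate(alpha)
    print(f"alpha = {alpha}: observed Type I error rate = {rate:.4f}")
```

Under these assumptions the observed rejection rate should come out close to the chosen alpha, which is what it means to accept a 5% (or 1%) chance of a Type I error.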
The purpose of conducting a statistical analysis on a sample is to calculate a p-value: the probability, assuming the
null hypothesis is true, of obtaining a result at least as extreme as the one actually observed. A small p-value means
the observed data would be unlikely if the null hypothesis were true. If your analysis is completed using a computer
program, the software will most likely report an exact value for p. If, on the other hand, you conduct the analysis by
hand, you will need to look up an approximate p-value from a table (the table used will depend on the particular
analysis conducted). If the p-value is less than the probability of making a Type I error that you are willing
to accept (alpha), you should reject the null hypothesis.
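The decision rule can be sketched in a few lines of code. This is a minimal illustration, not a recommended analysis: it assumes a one-sample z-test with a known population standard deviation, computes p as the probability under the null hypothesis of a test statistic at least as extreme as the one observed, and uses made-up measurements. The function names and all numbers are hypothetical.

```python
from statistics import NormalDist, mean

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: mu = mu0, assuming a known population sigma."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    # Probability, under H0, of a z statistic at least this far from zero.
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 only when p is less than alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Made-up measurements, for illustration only.
sample = [10.9, 11.4, 10.2, 11.8, 10.7, 11.1, 11.5, 10.8]
p = z_test_p_value(sample, mu0=10.0, sigma=1.0)
print(f"p = {p:.4f}: {decide(p)}")
```

Note that the function returns only the two outcomes described earlier, 'reject H0' or 'fail to reject H0'; it never reports that the null hypothesis has been proven.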
One-Sided and Two-Sided Tests of Hypotheses
It was mentioned earlier that in a properly constructed set of hypotheses, every possible outcome is accounted for,
and that the null hypothesis always includes the equality. Given these constraints it is possible to construct three
different combinations of null and alternative hypotheses:
1) H0: μ1 = μ2, HA: μ1 ≠ μ2
2) H0: μ1 ≤ μ2, HA: μ1 > μ2
3) H0: μ1 ≥ μ2, HA: μ1 < μ2
The first example is referred to as a two-sided test. This would be appropriate when we have no a priori reason to
suspect a difference in a particular direction and are interested in any difference that exists between the two
populations. It also means that the possibility exists of concluding, incorrectly, that μ1 < μ2 or that μ1 > μ2. We
therefore split the risk of making a Type I error equally between both possibilities of being wrong.
The latter two examples are referred to as one-sided tests. These would be appropriate when we have some a priori
reason to suspect a difference in only one direction. We therefore limit the direction in which we can make a Type I
error and retain an undivided alpha in that direction.
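The effect of splitting alpha, or leaving it undivided, can be seen by computing the p-value for each alternative hypothesis (means not equal, mean 1 greater than mean 2, mean 1 less than mean 2) from the same test statistic. The sketch below assumes a z statistic for the difference (mean 1 minus mean 2) that follows a standard normal distribution when the means are equal; the function name and the value z = 1.80 are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def p_values(z):
    """p-values for the three alternatives, given a z statistic for (mean1 - mean2)."""
    cdf = NormalDist().cdf
    return {
        # Two-sided: both tails count, so the risk is split between directions.
        "two-sided (HA: mu1 != mu2)": 2 * (1 - cdf(abs(z))),
        # One-sided: only the tail in the suspected direction counts.
        "one-sided (HA: mu1 > mu2)": 1 - cdf(z),
        "one-sided (HA: mu1 < mu2)": cdf(z),
    }

alpha = 0.05
z = 1.80  # hypothetical test statistic
for label, p in p_values(z).items():
    verdict = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{label}: p = {p:.4f}, {verdict}")
```

With this value of z, the null hypothesis is rejected at alpha = 0.05 only for the one-sided alternative in the direction of the observed difference. The two-sided p-value is exactly twice as large and exceeds alpha, which is why the direction of a one-sided test must be chosen a priori rather than after looking at the data.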