Geography 200: INTRODUCTION TO RESEARCH METHODS

Dr. Rodrigue

Graded Lab 7: Hypothesis Testing with Two-Sample Z and t Tests and ANOVA

==========

This lab expands on the comparison-of-means idea that you worked on for Lab 6. That time, you compared one sample against a population. That's not too common a research problem, however. Much more common is comparing one sample against another to see whether they could plausibly have come from the same population (about which you know nothing). So, this lab introduces two-sample difference of means and difference of proportions tests. It also introduces the Analysis of Variance (or ANOVA) as a way of comparing the means of more than two samples.

For all questions, please do your calculations at the full capacity of your spreadsheet or calculator, rounding your final answers to two decimal places of accuracy (i.e., 0.00).

LAB EXERCISE A: Two Sample Test of Means

As a geomorphologist, you have devised a valid sampling strategy based on transects taken down several slopes, each dominated by either limestone substrates or clastic sedimentary substrates. At each point along each of your transects, you measured slope angle in degrees from the horizontal. Then, for each transect, you calculated the average slope angle. You are testing the null hypothesis that there is no significant difference between slope angles formed on limestone and clastic sedimentary substrates in your central California location. You have set your confidence level at 0.95 (alpha, then, is 0.05). Here are your data:

     AVERAGE SLOPE ANGLE PER TRANSECT, BY SUBSTRATE (degrees)

          Limestone       Clastic Sedimentary

               32.1                 17.8
               29.4                 15.8
               33.0                 12.5
               27.3                 15.5
               19.0                 15.1
               14.4                 12.2
               21.1                 13.1
               25.5                 10.6
                9.1                  9.3
               10.5                  5.5 
               10.5 
               11.0
               14.2

  1. Given the form of your null hypothesis, are you about to do a one-tail or two-tail test?
         _____ 1 tail          _____ 2 tail
    
  2. Considering that you don't have population variances for slope angles in each substrate type and that you only have 13 and 10 transects for limestone and clastic sedimentary substrates, respectively, which test should you use, Z test or t test?
         _____  Z test         _____ t test
    
    
  3. What are the means for transects on:
         a.  the  limestone substrate?           __________     
    
         b.  clastic sedimentary substrate?      __________
    
  4. What are the standard deviations for transects on:
         a.  the  limestone substrate?           __________
    
         b.  clastic sedimentary substrate?      __________
    
  5. What, then, are the variances for transects on:
         a.  the  limestone substrate?           __________
    
         b.  clastic sedimentary substrate?      __________
    
  6. Now, eyeballing the standard deviations above, are they similar enough to warrant a pooled variance estimate (PVE) of the population variance, or should you use the separate variance estimate (SVE)?
                   _____  PVE                     _____ SVE
    
  7. Having made your decision about the PVE or SVE choice, go ahead and calculate the two-sample test statistic for the difference of means.
                                                       ___________________________
    
  8. Now, using the right table the right way, what is the critical statistic for your samples?
                                                    ______________________________
    
  9. Briefly interpret your findings in clear English:
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
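
If you would rather script the arithmetic than use a spreadsheet, the two versions of the test statistic can be sketched roughly as follows. Treat this as an illustration under stated assumptions, not as the required method: the function names pooled_t and separate_t and the two tiny samples are made up, the variances are the usual n - 1 sample variances, and the SVE degrees of freedom use the Welch-Satterthwaite approximation, which may not be the rule your textbook gives. Check each formula against your textbook before relying on it.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate (PVE)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Weight each sample variance (n - 1 denominator) by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / std_error, df


def separate_t(a, b):
    """Two-sample t statistic with separate variance estimates (SVE)."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    std_error = sqrt(v1 + v2)
    # Assumption: Welch-Satterthwaite degrees of freedom (often not a whole number).
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean(a) - mean(b)) / std_error, df


# Made-up illustrative samples, NOT the lab data.
sample_1 = [10, 12, 14]
sample_2 = [4, 6, 8]

t_pve, df_pve = pooled_t(sample_1, sample_2)
t_sve, df_sve = separate_t(sample_1, sample_2)
print(f"PVE: t = {t_pve:.2f}, df = {df_pve}")
print(f"SVE: t = {t_sve:.2f}, df = {df_sve:.2f}")
```

In this toy example the two samples have equal sizes and equal variances, so both versions return the same t. With real data they can differ, which is why the PVE or SVE decision in question 6 matters.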
    
    

==========

LAB EXERCISE B: Two Sample Test of Proportions

The Arabian horse is a small breed of very refined riding horses. American breeders (and foreign breeders marketing to American buyers) are generally trying to use selection to breed larger animals for larger American riders, without losing the refinement and general appearance of the breed. A given animal's phenotype (appearance, including size) is governed partly by its genotype (genetic inheritance, including genes for size and inbreeding effects) and partly by a wide range of environmental variables. Most Americans interested in the breed want Arabian horses at least 150.0 cm (~15.0 "hands") tall. The question is: should you breed the smaller animals if their height is partly a function of environment?

There is a belief among horsepeople that pastures on limestone soils provide an ideal balance of calcium to phosphorus, and foals raised on them grow into larger, more robust adults. This lab exercise will evaluate the effects of soil substrate on samples of genetically very closely related animals. So, imagine that the Arabian Horse Association (AHA) has drawn stratified random samples of purebred Arabian mares whose pedigrees show them to be all from the same lineages. The AHA, in this imaginary study, has drawn 250 animals from the Santa Ynez Valley of California and from the Bluegrass area of Kentucky, areas with limestone-based soils. It has also drawn another 250 from Scottsdale, AZ, and the Antelope Valley of California, regions characterized by desert soils based on granites. As the consultant doing the data analysis, you have decided to set your alpha criterion at 0.05.

After contacting the owners and getting them to measure the animals' height at the withers (base of the neck), the AHA, in this made-up example, got 217 height records from the limestone regions and 205 from the desert regions. Of the 217 animals from the limestone areas, 186 proved to be at least 150.0 cm tall; of the 205 from the granite areas, 145 were at least 150.0 cm tall.

  1. Given the discussion above, will you be doing a one-tail test or a two-tail test?
         __________  1-tail        __________ 2-tail  
    
    
  2. What proportion (not percentage) of horses from limestone country meet or exceed the 150.0 cm standard?
                                   __________
    
  3. What proportion (not percentage) of horses from granite country meet or exceed the 150.0 cm standard?
                                   __________
    
  4. Given the size of the samples and the randomized way in which they were developed, you can justify using the Z test instead of the t test. Using the following simple, calculational formula for the Z test, calculate Z:
    
         $$Z = \frac{p_1 - p_2}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$
    
    
    
    What is your calculated Z for the difference of proportions? __________

  5. What is the critical Z for the chosen alpha level and number of tails in the test?
         __________
    
    
  6. Do you, then, reject the null hypothesis or not reject it?
         __________  reject        __________ not reject  
    
    

  7. What is the probability that random sampling alone could have produced results at least as extreme as these if the null hypothesis were true?
         __________ prob-value
    
  8. Briefly interpret your findings in clear English:
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
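
For anyone scripting this exercise, here is a minimal sketch of the unpooled Z formula from question 4, together with a critical Z and a prob-value taken from the standard normal curve. The function names two_proportion_z, critical_z, and prob_value are hypothetical, the counts are made up, and NormalDist comes from Python's standard statistics module (Python 3.8 or later). It is a sketch of the arithmetic, not a substitute for reading the Z table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(hits_1, n_1, hits_2, n_2):
    """Z for the difference of two sample proportions (unpooled standard error)."""
    p1, p2 = hits_1 / n_1, hits_2 / n_2
    std_error = sqrt(p1 * (1 - p1) / n_1 + p2 * (1 - p2) / n_2)
    return (p1 - p2) / std_error


def critical_z(alpha, tails):
    """Positive critical Z for a 1-tail or 2-tail test."""
    return NormalDist().inv_cdf(1 - alpha / tails)


def prob_value(z, tails):
    """Area under the normal curve beyond |z|, doubled for a 2-tail test.

    For a 1-tail test this is only meaningful when the observed difference
    lies in the direction the alternative hypothesis predicted.
    """
    return tails * (1 - NormalDist().cdf(abs(z)))


# Made-up counts, NOT the lab data: 60 of 100 versus 50 of 100.
z = two_proportion_z(60, 100, 50, 100)
z_crit = critical_z(alpha=0.05, tails=2)
p = prob_value(z, tails=2)
print(f"Z = {z:.2f}, critical Z = {z_crit:.2f}, prob-value = {p:.4f}")
```

The tails argument is where your answer to question 1 enters: it changes both the critical Z and the prob-value, so decide on it before looking at the results.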
    
    

==========

LAB EXERCISE C: Analysis of Variance (ANOVA)

ANOVA is a technique used when you have interval or ratio measurements for three or more different categories. In other words, one axis of your test is measured at the scalar level, and the other is measured at the nominal level. You can do ANOVA for two categories, but that converges on the plain old t-test for the difference of means, which is a lot easier to do.
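
The claim that two-category ANOVA converges on the t-test can be checked numerically: with two groups, the F ratio equals the square of the pooled-variance t statistic. The sketch below uses two made-up samples (not the lab data) and the definitional form of the between-group sum of squares, which is algebraically the same quantity as the calculational form used later in this exercise.

```python
from math import sqrt
from statistics import mean, variance

# Made-up illustrative samples, NOT the lab data.
group_a = [1, 2, 3]
group_b = [2, 4, 6]
n_a, n_b = len(group_a), len(group_b)

# Pooled-variance t test for the difference of means.
pooled_var = ((n_a - 1) * variance(group_a)
              + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
t_stat = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))

# One-way ANOVA on the same two groups.
grand_mean = mean(group_a + group_b)
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in (group_a, group_b))
ss_within = sum((len(g) - 1) * variance(g) for g in (group_a, group_b))
ms_between = ss_between / (2 - 1)          # k - 1 degrees of freedom
ms_within = ss_within / (n_a + n_b - 2)    # N - k degrees of freedom
f_ratio = ms_between / ms_within

print(f"t = {t_stat:.4f}, t squared = {t_stat ** 2:.4f}, F = {f_ratio:.4f}")
```

The sign of t is lost when it is squared, which is one reason the t-test remains the handier tool for two samples: it keeps the direction of the difference.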

Back to Arabian horses. In this scenario, YOU are the fabulously wealthy owner of three ranches, one in the Santa Ynez Valley, one in Scottsdale, and one in the Santa Monica Mountains of Southern California. You are trying to decide what sort of criterion you will use to exclude mares (adult female horses) from your broodmare bands, factoring in genetics and things affecting the animals' feed.

The following are the heights at withers of simple random samples of each of your herds of Arabian mares, all three herds being derived from the same original foundation stock (genetically very closely related). The three herds were raised in these very different places, and the animals sampled in each herd are those who actually grew up on your farms. One of the farms is in the Santa Ynez Valley, an area with a lot of limestone as parent material for the soils on which the horses are pastured. The second farm is in the Scottsdale area, and the pastures on that farm happen to be on soils derived from granites. The third farm, in the Santa Monica Mountains of Southern California, feeds the animals alfalfa and grass hays from the Antelope Valley in the Mojave Desert (again, a lot of granite-related soils) and supplemental grains. The animals on the first two farms are fed entirely from irrigated pasture, while the third herd depends entirely on hay and supplements.

As a show breeder, you are part of the movement trying to use selection to make the small Arabian horse breed produce larger individuals more suited to large American riders. Your culling standard is to sell off any young adult mares smaller than 150.0 cm (about 15.0 "hands") without breeding them, on the theory that their small size is largely genetically determined. You are aware of the possibility that this standard might be affected by the soil substrate of these three farms and wonder whether you ought to adjust your culling standard for each farm if there are significant environmental effects on the animals' adult sizes.

There is also a concern that overfeeding with grains skews the calcium-to-phosphorus balance and that overnutrition, while producing fast-growing animals, tends to create tendon and ligament problems that reduce the soundness of adolescent and adult horses. So, you get to use ANOVA on this quintessentially geographic problem: Can you see the effects of pasture/food location on the animals' phenotypes? The null hypothesis, of course, is that there is no significant difference among the three samples, at the 0.05 alpha level (or 0.95 confidence level). If there is no significant difference at the 0.05 level, you will apply your breeding standard to all three farms; if the difference is significant, you will resign yourself to using a different standard on each farm and a lot of record-keeping hassle.

Here are your data:


     Height at withers, adult mares in cm:

          Santa Ynez            Scottsdale          Santa Monicas 
              sample                sample                 sample
          -------------------------------------------------------
               147.5                 142.5                  137.5 
               152.5                 150.0                  152.5
               147.5                 140.0                  147.5
               157.5                 142.5                  145.0
               155.0                 145.0                  150.0
               157.5                 137.5                  152.5
               150.0                 142.5                  150.0

  1. Calculate the group means for each of the three herd samples:
         Santa Ynez ________   Scottsdale ________   Santa Monicas ________ 
    
    
  2. Calculate the grand total weighted mean for all 21 animals. You do this by multiplying each sample's mean by the number of horses in that sample and then adding these three products together. Then, divide the summed products by the total number of animals in all three samples. Here's the formula, if the description didn't make sense:
         $$\bar{X}_T = \frac{\sum_{i=1}^{k} n_i \bar{X}_i}{N}$$

         $\bar{X}_T$  =  __________

         Where:
                 $\bar{X}_i$ = mean height of sample i
                 $n_i$ = number of mares in sample i
                 $k$ = number of samples (3)
                 $N$ = total number of mares in all samples
                 $\bar{X}_T$ = weighted total mean height for all mares in all k samples
    
    

  3. Calculate the variances (careful: not the standard deviations) for each of the three herd samples ($s_i^2$):
         Santa Ynez ________   Scottsdale ________   Santa Monicas ________ 
    
    
    
    Now, start calculating the between-group sum of squares (SSB). This is a process that will take us four steps (questions #4-7) to do. Here is an overview. To calculate the between-group sum of squares, take each herd sample's mean and square it. Then, for each herd sample, multiply the answer by the number of horses in each sample. Sum the three products. Then, square the weighted total mean and multiply it by the total number of horses in all three samples (21). Subtract that product from the sum of products for the three samples. Here's the formula:
         $$SSB = \left( \sum_{i=1}^{k} n_i \bar{X}_i^2 \right) - N \bar{X}_T^2$$
    
    
  4. So, to start calculating the between-group sum of squares, square each herd's mean and multiply that squared mean by the number of horses in the sample (7):
         Santa Ynez ________   Scottsdale ________   Santa Monicas ________ 
    
    
  5. Continue by summing the three answers above: ___________

  6. Now, take the overall, grand mean you did in #2, square it, and then multiply it by the grand total number of horses in the three herd samples taken together.
         $N \bar{X}_T^2$  =  ________
           
    
  7. That done, subtract this answer (to #6) from the sum calculated in #5:
         SSB = ________
    
    
    
  8. Calculate between-group degrees of freedom (DFB). This is the number of herd samples minus one (k - 1): ________

  9. At this point, you are in a position to calculate the between-group mean squares or between-group variance (MSB). Divide the SSB by the DFB:
         MSB = SSB/DFB = ________
    
    
    Now you can calculate the within-group sum of squares (SSW), which is a two-step process that entails working out the following:
         $$SSW = \sum_{i=1}^{k} (n_i - 1) s_i^2$$
    
    
  10. For each herd sample, first multiply its variance by the sample size minus one, $(n_i - 1) s_i^2$:
         Santa Ynez ________   Scottsdale ________   Santa Monicas ________ 
    
    
  11. Then, sum these three to get the SSW: ___________

  12. Calculate the within-groups degrees of freedom (DFW). This is defined as the grand total number of animals in all three samples taken together (N) minus the number of samples (k).
         DFW = N - k = ________
    
    
  13. Let's move on to doing the within-group mean squares (MSW) or within-group variance. This is the within-group sum of squares divided by the within-group degrees of freedom.
         MSW = SSW/DFW = SSW/(N-k) = ________
    
    
  14. Now, at long last, calculate the F ratio (the test statistic for ANOVA). Fcalc = between-group variance (or MSB) divided by within-group variance (or MSW)
                   MSB
         Fcalc = _______ = __________
                   MSW
    
    
  15. The Fcalc tells you the extent to which between-group variance exceeds within-group variance (presumably, the F ratio you calculated will be greater than one). What you need to do now is decide if the ratio is significantly greater than one.

    To do this, turn to the critical F ratio table, which you'll enter based on your alpha level decision before you started the problem. Your textbook has an F table on pp. 613-619. To get into it, you need to assign DF1 and DF2 to the appropriate variance. Remember DF1 is the degrees of freedom associated with the larger of the two variances (or mean squares): within-group variance (MSW) OR between-group variance (MSB). Re-examining your answers to questions 9 and 13, check which of the two variances, or mean squares, is the larger:

         _____ within-group variance (MSW)    _____ between-group variance (MSB)
    
    
  16. How many degrees of freedom (DF1) are associated with the larger variance?
         __________
    
  17. What are the degrees of freedom (DF2) associated with the smaller variance?
         __________
    
  18. Now that you know which DF is which, extract the critical F ratio from the proper table. Make sure you have the table for an alpha of 0.05, then look up DF1 across the top and DF2 down the left and report the critical statistic at their intersection.
         Fcrit =  _______________ 
    
  19. So, do you reject the null hypothesis or not reject it?
         _____ reject       _____ not reject
    
    
  20. Briefly interpret your results in English, making sure to cover what you should do about your culling standards, given your breeding goals, in light of your decision about the fate of the null hypothesis:
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
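
The whole chain of calculations in questions 1 through 14 can be sketched in a few lines of code, which is a handy way to check a spreadsheet. This is an illustration under stated assumptions rather than the required method: the function name one_way_anova, its round_grand_mean_to option, and the heights are all made up, and the critical F still has to come from your textbook's table.

```python
from statistics import mean, variance


def one_way_anova(groups, round_grand_mean_to=None):
    """One-way ANOVA following the lab's calculational steps.

    round_grand_mean_to is a hypothetical option that rounds the grand mean
    early, only to show how much damage premature rounding can do.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    n_total = sum(sizes)

    # Question 2: weighted grand mean.
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    if round_grand_mean_to is not None:
        grand_mean = round(grand_mean, round_grand_mean_to)

    # Questions 4-7: between-group sum of squares, calculational form.
    ssb = sum(n * m ** 2 for n, m in zip(sizes, means)) - n_total * grand_mean ** 2
    # Questions 10-11: within-group sum of squares from the sample variances.
    ssw = sum((n - 1) * variance(g) for n, g in zip(sizes, groups))

    dfb, dfw = k - 1, n_total - k
    msb, msw = ssb / dfb, ssw / dfw
    return {"means": means, "grand_mean": grand_mean, "SSB": ssb, "DFB": dfb,
            "MSB": msb, "SSW": ssw, "DFW": dfw, "MSW": msw, "F": msb / msw}


# Made-up heights in cm, NOT the lab data.
herds = [[150, 151, 153], [148, 149, 152], [155, 156, 158]]

full = one_way_anova(herds)
early = one_way_anova(herds, round_grand_mean_to=2)
print(f"Full precision:      SSB = {full['SSB']:.2f}, F = {full['F']:.2f}")
print(f"Grand mean rounded:  SSB = {early['SSB']:.2f}, F = {early['F']:.2f}")
```

The second call shows why the lab asks you to carry full precision: the calculational SSB formula subtracts two large, nearly equal numbers, so rounding the grand mean to two decimals before squaring it shifts SSB, and therefore F, by a noticeable amount.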
    
    

==========

first placed on the web 11/16/98
last revised: 11/01/10
© Dr. Christine M. Rodrigue

==========