Geography 215: QUANTITATIVE METHODS

Dr. Rodrigue

Graded Lab 7: Bi-Variate Hypothesis Testing with Interval and Ratio Data


For all questions, please do your calculations at the full capacity of your spreadsheet or calculator, but round your answers to two decimal places of accuracy (i.e., 0.00).

LAB EXERCISE A: Two Sample Test of Means

As a geomorphologist, you have devised a valid sampling strategy based on transects taken down several slopes, each dominated by either limestone substrates or clastic sedimentary substrates. At each point along each of your transects, you measured slope angle in degrees from the horizontal. Then, for each transect, you calculated the average slope angle. You are testing the null hypothesis that there is no significant difference between slope angles formed on limestone and clastic sedimentary substrates in your central California location. You have set your confidence level at .95 (alpha, then, is .05). Here are your data:

     SLOPE ANGLE BY SUBSTRATE

          Limestone       Clastic Sedimentary

               32.1                 17.8
               29.4                 15.8
               33.0                 12.5
               27.3                 15.5
               19.0                 15.1
               14.4                 12.2
               21.1                 13.1
               25.5                 10.6
                9.1                  9.3
               10.5                  5.5 
               10.5 
               11.0
               14.2

  1. Given the form of your null hypothesis, are you about to do a one-tail or two-tail test?
         _____ 1 tail          _____ 2 tail
    
  2. Considering that you don't have population variances for slope angles in each substrate type and that you have 13 and 10 transects for limestone and clastic sedimentary substrates, respectively, which test should you use, Z test or t test?
         _____  Z test         _____ t test
    
  3. What are the means for transects on:
         a.  the  limestone substrate?           __________     
    
         b.  clastic sedimentary substrate?      __________
    
  4. What are the standard deviations for transects on:
         a.  the  limestone substrate?           __________
    
         b.  clastic sedimentary substrate?      __________
    
  5. What, then, are the variances for transects on:
         a.  the  limestone substrate?           __________
    
         b.  clastic sedimentary substrate?      __________
    
  6. Now, eyeballing the standard deviations above, are they similar enough to warrant a pooled estimate of the population variance (PVE), or should you use the separate variance estimate (SVE)?
                   _____  PVE                     _____ SVE
    
  7. Having made your decision about the PVE or SVE choice, go ahead and calculate the two-sample test statistic for the difference of means.
                                                       ___________________________
    
  8. Now, using the right table the right way, what is the critical test statistic for your samples?
                                                    ______________________________
    
  9. Briefly interpret your findings in clear English:
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
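
To check your arithmetic in questions 6 and 7, the two ways of estimating the standard error can be sketched in Python. This is only a sketch under stated assumptions, not M&M's own procedure: the function name two_sample_t and the toy numbers are made up for illustration (they are not the lab data), and the degrees of freedom for the SVE case use the Welch-Satterthwaite approximation, which may differ from the rule in your textbook.

```python
import math
from statistics import mean, variance


def two_sample_t(x, y, pooled=True):
    """Return (t, df) for the difference of two sample means."""
    n1, n2 = len(x), len(y)
    m1, m2 = mean(x), mean(y)
    v1, v2 = variance(x), variance(y)  # sample variances (n - 1 denominator)
    if pooled:
        # Pooled variance estimate (PVE): weight each variance by its df.
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Separate variance estimate (SVE).
        # Assumption: Welch-Satterthwaite df; your text may use another rule.
        a, b = v1 / n1, v2 / n2
        se = math.sqrt(a + b)
        df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (m1 - m2) / se, df


# Toy numbers for illustration only (NOT the slope-angle data).
x = [2.0, 4.0, 6.0]
y = [1.0, 2.0, 3.0]
t_pve, df_pve = two_sample_t(x, y, pooled=True)
t_sve, df_sve = two_sample_t(x, y, pooled=False)
print("PVE:", round(t_pve, 2), "df =", df_pve)
print("SVE:", round(t_sve, 2), "df =", round(df_sve, 2))
```

With equal sample sizes, as in the toy numbers, the two estimates give the same t and differ only in degrees of freedom. With unequal sample sizes, as in the lab, the two t values generally differ as well. The critical value still has to come from the t table.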
    
    


LAB EXERCISE B: Two Sample Test of Proportions

The Arabian horse is a small breed of very refined riding horse. American breeders (and foreign breeders marketing to American buyers) are generally trying to use selection to breed larger animals for larger American riders, without losing the refinement and general appearance of the breed. A given animal's phenotype (appearance, including size) is governed partly by its genotype (genetic inheritance, including genes for size and inbreeding effects) and partly by a wide range of environmental variables. Most Americans interested in the breed want Arabian horses at least 1500 mm (about 15.0 "hands") tall. The question is whether you should breed the smaller animals if their height is partly a function of environment.

There is a belief among horsepeople that pastures on limestone soils provide an ideal balance of calcium to phosphorus, and foals raised on them grow into larger, more robust adults. This lab exercise will evaluate the effects of soil substrate on samples of genetically very closely related animals. So, imagine that the Arabian Horse Registry of America has drawn out stratified random samples of purebred Arabian mares whose pedigrees show them to be all from the same lineages. The AHRA, in this imaginary study, has drawn out 250 animals from the Santa Ynez Valley of California and from the Bluegrass area of Kentucky, areas with limestone-based soils. It has also drawn out another 250 from Scottsdale, AZ, and the Antelope Valley of California, regions characterized by desert soils based on granites.

After contacting the owners and getting them to measure the animals' height at the withers (base of the neck), the Registry, in this made-up example, got 217 height records from the limestone regions and 205 from the desert regions. Of the 217 animals from the limestone areas, 186 proved to be at least 1500 mm tall; of the 205 from the granite areas, 145 were at least 1500 mm tall.

  1. Given the discussion above, will you be doing a one-tail test or a two-tail test?
         __________  1-tail        __________ 2-tail  
    
    
  2. What proportion (not percentage) of horses from limestone country meet or exceed the 1500 mm standard?
                                   __________
    
  3. What proportion (not percentage) of horses from granite country meet or exceed the 1500 mm standard?
                                   __________
    
  4. M&M have an erroneous formula for the pooled estimate of the focal category in the population (9.12). Use the following formula for a Z test, then:
    
              Z = (p1 - p2) / sqrt[ p1(1 - p1)/n1 + p2(1 - p2)/n2 ]
    
    
    
    What is your calculated Z for the difference of proportions? __________

  5. What is the critical Z for the chosen alpha level and number of tails in the test?
         __________
    
    
  6. Do you, then, reject the null hypothesis or not reject it?
         __________  reject        __________ not reject  
    
    

  7. What is the probability that random sampling could have produced results as extreme as these?
         __________ prob-value
    
  8. Briefly interpret your findings in clear English:
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
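
The Z formula in question 4, the critical Z in question 5, and the prob-value in question 7 can be sketched with nothing but Python's standard library. This is a hedged illustration: the function names are invented here, the counts are toy numbers rather than the Registry's, and the prob-value shown is the upper-tail area, which only suits a one-tail test in the direction p1 > p2.

```python
import math
from statistics import NormalDist


def z_two_proportions(hits1, n1, hits2, n2):
    """Z for the difference of two proportions, using the formula
    given in question 4 (each sample's own variance, no pooling)."""
    p1, p2 = hits1 / n1, hits2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se


def critical_z(alpha, tails):
    """Critical Z from the standard normal curve; tails is 1 or 2."""
    return NormalDist().inv_cdf(1 - alpha / tails)


def upper_tail_prob(z):
    """Area under the standard normal curve above z."""
    return 1 - NormalDist().cdf(z)


# Toy counts for illustration only (NOT the lab data):
# 80 of 100 in sample 1 meet the standard, 60 of 100 in sample 2.
z = z_two_proportions(80, 100, 60, 100)
print("Z calc:", round(z, 2))
print("Z crit, 1 tail, alpha .05:", round(critical_z(0.05, 1), 2))
print("Z crit, 2 tail, alpha .05:", round(critical_z(0.05, 2), 2))
print("upper-tail prob-value:", round(upper_tail_prob(z), 4))
```

For a two-tail test, the prob-value would be twice the tail area beyond the absolute value of Z.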
    
    


LAB EXERCISE C: Analysis of Variance (ANOVA)

ANOVA is a technique used when you have interval or ratio measurements for three or more different categories. In other words, one axis of your test is measured at the interval or ratio level, and the other is measured at the nominal level. You can do ANOVA for two categories, but that converges on the plain old t-test for the difference of means, which is easier to do.
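
The claim that two-category ANOVA converges on the t-test can be checked numerically: with two groups, the F ratio equals the square of the pooled-variance t. The sketch below uses invented function names and toy numbers, and it assumes the pooled (PVE) form of the t-test.

```python
import math
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t using the pooled variance estimate."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


def one_way_f(*groups):
    """F ratio: between-group mean squares over within-group mean squares."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_b = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_w = sum((len(g) - 1) * variance(g) for g in groups)
    return (ss_b / (k - 1)) / (ss_w / (n_total - k))


# Toy numbers for illustration only.
x = [2.0, 4.0, 6.0]
y = [1.0, 2.0, 3.0]
t = pooled_t(x, y)
f = one_way_f(x, y)
print("t squared:", round(t ** 2, 2), " F:", round(f, 2))
```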

Back to Arabian horses. In this scenario, YOU are the wealthy owner of three ranches, one in the Santa Ynez Valley, one in Scottsdale, and one in the Santa Monica Mountains of Southern California. You are trying to decide what sort of criterion you will use to exclude mares from your broodmare bands, factoring in genetics and factors affecting the animals' feed.

The following are the heights at withers of simple random samples from each of your herds of Arabian mares, all three herds being derived from the same original foundation stock (genetically very closely related). The three herds were raised in these very different places, and the animals sampled in each herd are those that actually grew up on your farms. The first farm is in the Santa Ynez Valley, an area with a lot of limestone as parent material for the soils on which the horses are pastured. The second farm is in the Scottsdale area, and its pastures happen to be on soils derived from granites. The animals on these two farms are fed entirely from irrigated pasture. The third farm, in the Santa Monica Mountains of Southern California, depends entirely on alfalfa and grass hays from the Antelope Valley in the Mojave Desert (again, a lot of granite-related soils) and supplemental grains.

As a show breeder, you are part of the movement trying to use selection to make the small Arabian horse breed produce larger individuals more suited to large American riders. Your culling standard is to sell off any young adult mares smaller than 1500 mm (about 15.0 "hands") without breeding them, on the theory that their small size is largely genetically determined. You are aware of the possibility that this standard might be affected by the soil substrate and feed source at each farm and wonder whether you ought to adjust your culling standard for each farm if there are significant environmental effects on the animals' adult sizes.

There is also a concern that overfeeding with grains skews the balance of calcium to phosphorus and that overnutrition, while producing fast-growing animals, tends to create tendon and ligament problems that reduce the soundness of adolescent and adult horses. So, you get to use ANOVA on this quintessentially geographic problem: Can you see the effects of pasture/food location on the animals' phenotypes? The null hypothesis, of course, is that there is no significant difference among the three samples, at the .05 alpha level (or .95 confidence level).

Here are your data:

     Height at withers, adult mares in mm:

          Santa Ynez sample     Scottsdale sample     Santa Monicas sample

               1475                  1425                   1375 
               1525                  1500                   1525
               1475                  1400                   1475
               1575                  1425                   1450
               1550                  1450                   1500
               1575                  1375                   1525
               1500                  1425                   1500

  1. Calculate the group means for each of the three herd samples:
         Santa Ynez ________   Scottsdale ________   Santa Monicas ________ 
    
    
  2. Calculate the grand mean for all 21 animals: __________

  3. Calculate the variances (careful: not the standard deviations) for each of the three herd samples (s_i^2):
         Santa Ynez ________   Scottsdale ________   Santa Monicas ________ 
    
    
  4. Now, start calculating the between-group sum of squares (SSb), which is described by M&M formula 9.24. To do this, take each herd sample's mean and square it. Then, for each herd sample, multiply the answer by the number of horses in each sample.
         Santa Ynez ________   Scottsdale ________   Santa Monicas ________ 
    
    
  5. Continue by summing the three answers above: ________

  6. Now, take the overall, grand mean you did in #2, square it, and then multiply it by the grand total number of horses in the three herd samples taken together.
         N(X-bar)^2  =  ________
           
    
  7. That done, subtract this answer (to #6) from the sum calculated in #5:
         SSb = ________
    
    
  8. Calculate between-group degrees of freedom (DFb). This is the number of herd samples minus one (k - 1): ________

  9. At this point, you are in a position to calculate the between-group mean squares or between-group variance (MSb). Divide the SSb by the DFb (M&M 9.25):
         MSb = SSb/DFb = ________
    
    
  10. Now, calculate the within-group sum of squares (SSw), as featured in M&M 9.26. For each herd sample, multiply its variance by the sample size minus one, (n_i - 1) * s_i^2:
         Santa Ynez ________   Scottsdale ________   Santa Monicas ________ 
    
    
  11. Sum these three to get the SSw: ________

  12. Calculate the within-groups degrees of freedom (DFw). This is defined as the grand total number of animals in all three samples taken together (N) minus the number of samples (k).
         DFw = N - k = ________
    
    
  13. Let's move on to doing the within-group mean squares (MSw) or within-group variance. This is the within-group sum of squares divided by the within-group degrees of freedom.
         MSw = SSw/DFw = ________
    
    
  14. Now, at long last, calculate the F ratio (the test statistic for ANOVA). Fcalc = between-group variance (or MSb) divided by within-group variance (or MSw)
         Fcalc   =  ________
    
    
  15. The Fcalc tells you the extent to which between-group variance exceeds within-group variance (presumably, the F ratio you calculated will be greater than one). What you need to do now is decide if the ratio is significantly greater than one.

    To do this, turn to the appropriate critical F ratio table (based on your alpha level decision before you started the problem). To get into it, you need to assign DF1 and DF2 to the appropriate variance. Remember DF1 is the degrees of freedom associated with the larger of the two variances (or mean squares): within-group variance (MSw) OR between-group variance (MSb). Re-examining your answers to questions 9 and 13, check which of the two variances, or mean squares, is the larger:

         _____ within-group variance (MSw)         _____ between-group variance (MSb)
    
    
  16. How many degrees of freedom (DF1) are associated with the larger variance?
         __________
    
  17. What are the degrees of freedom (DF2) associated with the smaller variance?
         __________
    
  18. Now that you know which DF is which, extract the critical F ratio from the proper table.
         Fcrit =  _______________ 
    
  19. So, do you reject the null hypothesis or not reject it?
         _____ reject       _____ not reject
    
    
  20. Briefly interpret your results in English, making sure to cover what you should do about your culling standards, given your breeding goals, in light of your decision about the fate of the null hypothesis:
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
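
The chain of calculations in questions 1 through 14 can be sketched as one short Python function. This is an illustration under stated assumptions rather than a definitive implementation: the function name anova_steps and the toy numbers are invented here (they are not the herd data), and the critical F still has to come from the table.

```python
from statistics import mean, variance


def anova_steps(*groups):
    """One-way ANOVA by the computational route of questions 1-14."""
    k = len(groups)                                   # number of samples
    n_total = sum(len(g) for g in groups)             # N, all animals
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Questions 4-5: sum over samples of n_i * (sample mean)^2
    sum_n_mean_sq = sum(len(g) * mean(g) ** 2 for g in groups)
    # Questions 6-7: subtract N * (grand mean)^2 to get SSb
    ss_b = sum_n_mean_sq - n_total * grand_mean ** 2
    # Questions 10-11: SSw is the sum of (n_i - 1) * s_i^2
    ss_w = sum((len(g) - 1) * variance(g) for g in groups)
    # Questions 8 and 12: degrees of freedom
    df_b, df_w = k - 1, n_total - k
    # Questions 9, 13, 14: mean squares and the F ratio
    ms_b, ms_w = ss_b / df_b, ss_w / df_w
    return {"SSb": ss_b, "DFb": df_b, "MSb": ms_b,
            "SSw": ss_w, "DFw": df_w, "MSw": ms_w, "F": ms_b / ms_w}


# Toy numbers for illustration only (NOT the herd samples).
result = anova_steps([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0])
for name, value in result.items():
    print(name, round(value, 2))
```

For the toy numbers, F works out to 13. Note that SSb is the small difference between two large numbers, so squaring rounded means can throw the answer off badly. That is why the lab asks you to carry full precision and round only at the end.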
    
    


first placed on the web 11/16/98
last revised: 11/26/98
© Dr. Christine M. Rodrigue