GEOG 442

Biogeography

Lab 5: Chi-Squared Analysis of Your Lichen Field Data

==========

The purpose of this lab is to have you analyze the data you collected on lichens during your field observation in Charmlee Park. Chi-squared analysis will be used to see if there is a significant difference between the two field plots (the vertical rock face and the sunnier horizontal rock face) in terms of the quadrats reporting the three species of lichen (crustose, foliar, and fruticose or "mossy").

It would be helpful if you referred back to Lab 2, in which you first (re?)did a Chi-square dry run. If you are doing this lab in our Geography Lab, please be sure to bring an IBM-formatted 3.5" floppy diskette or a Zip disk, since we wouldn't want Woody to purge your half-done lab inadvertently!

==========

Lab 5a: Getting and Preparing Your Data for Analysis

For your group (vertical, shady rock wall or horizontal, sunny rock surface), please organize your field data into three generously proportioned Chi-square tables:

For each of the four data cells in each of the three tables, please count the number of quadrats that satisfy the "+/-" combinations above (how many quadrats had Crustose present but Foliar absent, for example). Each of the three tables should show total quadrat counts of 100 (10 columns on your PVC pipe sampling frame by 10 rows). If they don't, someone made a mistake and you need to re-examine your classification until you have 100 quadrats accounted for. When you are satisfied that you have all three quadrat counts done properly (they all add up to 100 quadrats), then enter those counts into the upper parts of the appropriate cells as your observed frequencies.

You should have three tables, then, that look kind of like this:


     VERTICAL or HORIZONTAL 

                      |          CRUSTOSE 
                      |              |              |    row
                      |    present   |    absent    |   totals
     ________________________________________________________________
                      |(a)           |(b)           |-e-
             present  | obs =        | obs =        |
                      | exp =        | exp =        |
     FOLIAR      ____________________________________________________
                      |(c)           |(d)           |-f-
              absent  | obs =        | obs =        |
                      | exp =        | exp =        |
     ________________________________________________________________
                      |-g-           |-h-           |-n-  100
     column totals    |              |              |
                      |              |              |



     VERTICAL or HORIZONTAL 

                      |          CRUSTOSE 
                      |              |              |    row
                      |    present   |    absent    |   totals
     ________________________________________________________________
                      |(a)           |(b)           |-e-
             present  | obs =        | obs =        |
                      | exp =        | exp =        |
     MOSSY       ____________________________________________________
                      |(c)           |(d)           |-f-
              absent  | obs =        | obs =        |
                      | exp =        | exp =        |
     ________________________________________________________________
                      |-g-           |-h-           |-n-  100
     column totals    |              |              |
                      |              |              |



     VERTICAL or HORIZONTAL 

                      |           FOLIAR 
                      |              |              |    row
                      |    present   |    absent    |   totals
     ________________________________________________________________
                      |(a)           |(b)           |-e-
             present  | obs =        | obs =        |
                      | exp =        | exp =        |
     MOSSY       ____________________________________________________
                      |(c)           |(d)           |-f-
              absent  | obs =        | obs =        |
                      | exp =        | exp =        |
     ________________________________________________________________
                      |-g-           |-h-           |-n-  100
     column totals    |              |              |
                      |              |              |



Lab 5b: Your Hypotheses

Examining your three tables, do you think you see a relationship or association between any pairs of species? That is, do you suspect foliar lichens are generally found with crustose lichens? For each of the three pairs, please state, first, your hunch about the association between the species and, then, the null hypothesis (or inverse of your hunch).

Crustose and foliar lichens working hypothesis:

     _________________________________________________________________________

     _________________________________________________________________________

     _________________________________________________________________________


Crustose and foliar lichens null hypothesis:

     _________________________________________________________________________

     _________________________________________________________________________

     _________________________________________________________________________


Crustose and mossy lichens working hypothesis:

     _________________________________________________________________________

     _________________________________________________________________________

     _________________________________________________________________________


Crustose and mossy lichens null hypothesis:

     _________________________________________________________________________

     _________________________________________________________________________

     _________________________________________________________________________


Foliar and mossy lichens working hypothesis:

     _________________________________________________________________________

     _________________________________________________________________________

     _________________________________________________________________________


Foliar and mossy lichens null hypothesis:

     _________________________________________________________________________

     _________________________________________________________________________

     _________________________________________________________________________

We will reject the null hypothesis if our results are so extreme that there is no more than a five percent chance that we could have gotten them by pure random luck-of-the-draw in placing those quadrat sampling frames. That is (stats refresher), we will use the 0.05 alpha standard. Another way of looking at it is that, by using such an extreme standard, we can have 95 percent confidence in our conclusion, should we wind up rejecting the null hypotheses and deciding that the association between each pair of lichen "species" is not random, that there is some sort of relationship between them.

==========

Lab 5c: Doing the Analysis

  1. Compute the marginal totals. That is, sum the observed frequencies in each row and put those sums in the appropriate row total (e or f). Do the same for the frequencies in each column and put those sums in the appropriate column total (g or h). The sum of row totals should equal the sum of column totals. Check that the grand total (in cell n) equals 100.

  2. Then, create the "expected frequencies" for each data cell (a through d). This is the distribution of cell counts you would expect from your data if there were no association between the two plant species (i.e., random processes were allocating them among the cells). To do this for each data cell, a through d, multiply the row total to its right by the column total below it and then divide the answer by n. Put the answer, rounded to three decimal places of accuracy, in its cell below the actual observed frequency.

    Still lost? Okay, okay. In other words, multiply cells e and g and divide the answer by 100. Put the answer, properly rounded, in the lower part of cell a. Similarly, multiply cells e and h and divide by 100, and put that answer in cell b. And so forth.

  3. That done, examine the expected frequencies to make sure you can properly proceed. Chi-square should not be used if any expected frequencies are below 2 (or, irrelevantly in this case, if more than 20 percent of the data cells have fewer than 5 actual cases). You will note that there are no such problems with your contingency table, so you can safely proceed through Chi-square.

  4. Now, move on to the worksheet below for calculating Chi-squared. In the first column, enter the observed frequencies for each data cell (the number in the upper, obs = part of cells a through d).

  5. In the second column, square those OBS frequencies.

  6. In the third column, divide each squared observed frequency by the corresponding expected frequency in the bottom of the appropriate data cell (a through d).

  7. Now, sum the third column and put the answer near the bottom of the spreadsheet (sum(O2/E)). Show your work here to three decimal places of accuracy.

  8. Finally, subtract n (or 100) from that sum. This answer is your calculated Chi-squared (X2). Put it at the bottom of the whole spreadsheet, also rounded to three decimal places of accuracy.
         ________________________________________________________________________
          CRUSTOSE VS. FOLIAR
    
          DATA CELL |     O     |       O2       |               O2/E
         ________________________________________________________________________
            (a)    |           |                |
         ________________________________________________________________________     
            (b)    |           |                |
         ________________________________________________________________________
            (c)    |           |                |
         ________________________________________________________________________
            (d)    |           |                |
         ________________________________________________________________________
    
                                                | sum(O2/E) = 
         ________________________________________________________________________
                               |         sum(O2/E) - n =           
                                         X2 =          
         ________________________________________________________________________
    
    
    
         ________________________________________________________________________
          CRUSTOSE VS. MOSSY
    
          DATA CELL |     O     |       O2       |               O2/E
         ________________________________________________________________________
            (a)    |           |                |
         ________________________________________________________________________     
            (b)    |           |                |
         ________________________________________________________________________
            (c)    |           |                |
         ________________________________________________________________________
            (d)    |           |                |
         ________________________________________________________________________
    
                                                | sum(O2/E) = 
         ________________________________________________________________________
                               |         sum(O2/E) - n =           
                                         X2 =          
         ________________________________________________________________________
    
    
    
         ________________________________________________________________________
          FOLIAR VS. MOSSY
    
          DATA CELL |     O     |       O2       |               O2/E
         ________________________________________________________________________
            (a)    |           |                |
         ________________________________________________________________________     
            (b)    |           |                |
         ________________________________________________________________________
            (c)    |           |                |
         ________________________________________________________________________
            (d)    |           |                |
         ________________________________________________________________________
    
                                                | sum(O2/E) = 
         ________________________________________________________________________
                               |         sum(O2/E) - n =           
                                         X2 =          
         ________________________________________________________________________
    
    
    
    
    
  9. Now, to interpret this X2calc, you need to compare it with a critical X2. To do this, you will need the Chi-squared table in Figure 1. You need your pre-selected alpha level to pick the right column and the degrees of freedom for your 2 x 2 contingency table to choose the right row to enter the table. Degrees of freedom in Chi-squared can be defined as:
         DF = (r - 1)(k - 1)
         where r = number of rows with obs data in them and 
               k = number of columns with obs data in them
    
    
    So, you will enter the table at the intersection of:
         the column headed ________ 
    
         and the row corresponding to ________ degrees of freedom.
    
    What, then, is your critical Chi-squared value for the association between crustose and foliar lichens?
         X2crit =  ________
    
    
    What is the critical Chi-squared value for the association between crustose and mossy lichens?
         X2crit =  ________
    
    
    What is the critical Chi-squared value for the association between foliar and mossy lichens?
         X2crit =  ________
    
    
  10. Is your X2calc ________ greater than or ________ less than the X2crit for the crustose and foliar lichen association?

    Is your X2calc ________ greater than or ________ less than the X2crit for the crustose and mossy lichen association?

    Is your X2calc ________ greater than or ________ less than the X2crit for the foliar and mossy lichen association?

  11. If your actual, calculated Chi-square value is greater than the critical Chi-square, you may safely conclude that your pattern is not just a random one. In other words, there is a statistically significant association of some sort between your variables (in this case, between species). If the calculated Chi-square value is less than the critical test value, you cannot rule out that the pattern is just a random one.

    Can the null hypothesis of random association between crustose and foliar lichens in this study area be rejected in this case?

         _____ reject Ho          _____ do not reject Ho
    
    

    Can the null hypothesis of random association between crustose and mossy lichens in this study area be rejected in this case?

         _____ reject Ho          _____ do not reject Ho
    
    

    Can the null hypothesis of random association between foliar and mossy lichens in this study area be rejected in this case?

         _____ reject Ho          _____ do not reject Ho
    
    

  12. It's always good etiquette, whenever possible, to report the prob-value (the probability of a Type I error, which expresses how much faith you can still put in the null hypothesis), on the off chance that a reader may have compelling reasons to use a different standard of alpha than you chose. I have provided the needed data in Figure 2, which tells you the probability that you could have gotten results as extreme as yours if there is but a random association between each pair of species.
         ________ prob-value of Ho for crustose and foliar association
    
         ________ prob-value of Ho for crustose and mossy association
    
         ________ prob-value of Ho for foliar and mossy association
    
    

  13. As you may remember from your earlier encounter with Chi-square, the results of the technique are affected by sample size. Chi-square can tell you IF there is a significant association but it's less solid about telling you the STRENGTH of that association. It's a good idea to calculate a measure of strength as well, which, for a 2 x 2 table, could be Yule's Q.

    To calculate Yule's Q, multiply the observed frequencies in data cells a and d, and likewise those in cells b and c. Then, enter these two products into the following formula:

              ad - bc
         Q =  _______
              ad + bc
    
    

    So, what is the Q value for the relationship between crustose and foliar lichens? ________

    What is the Q value for the relationship between crustose and mossy lichens? ________

    What is the Q value for the relationship between foliar and mossy lichens? ________

    Remember, Yule's Q can vary from -1 to +1. The closer it is to 0, the weaker the relationship is. The closer it is to -1 or +1, the stronger the relationship is, whether inverse (negative) or direct (positive).
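
    If you would like to check your hand calculations, steps 1 through 8 and the Yule's Q formula can be sketched in Python roughly as follows. This is only a minimal sketch and not part of the required lab work: the function names are made up for illustration, and the counts in the example (30, 20, 10, 40) are hypothetical, not field data. The shortcut sum(O2/E) - n used on the worksheets gives the same answer as the more familiar sum of (O - E)2/E.

```python
def chi_square_2x2(a, b, c, d):
    """Return (expected frequencies, Chi-squared) for observed cells a-d.

    Cell layout follows the lab tables: a and b are the top row,
    c and d are the bottom row.
    """
    e, f = a + b, c + d            # row totals
    g, h = a + c, b + d            # column totals
    n = e + f                      # grand total (100 quadrats in this lab)

    # Expected frequency = row total * column total / n, for cells a-d.
    # (Assumes no row or column total is zero.)
    expected = [e * g / n, e * h / n, f * g / n, f * h / n]
    observed = [a, b, c, d]

    # Worksheet shortcut: sum(O2/E) - n
    sum_o2_over_e = sum(o * o / x for o, x in zip(observed, expected))
    return expected, sum_o2_over_e - n


def yules_q(a, b, c, d):
    """Yule's Q = (ad - bc) / (ad + bc); undefined if ad + bc is zero."""
    return (a * d - b * c) / (a * d + b * c)


# Hypothetical counts for illustration only (they sum to 100).
expected, chi2 = chi_square_2x2(30, 20, 10, 40)
print("expected:", expected)                      # [20.0, 30.0, 20.0, 30.0]
print(f"X2calc = {chi2:.3f}")                     # 16.667
print(f"Q = {yules_q(30, 20, 10, 40):.3f}")       # 0.714
```

    With these made-up counts, X2calc is well above the 1 DF critical value at alpha = 0.05 in Figure 1, and Q is positive and fairly strong. Your own tables will, of course, give different numbers.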

    ==========

    Lab 5d: So, What Does It All Mean?

    Please describe the results of your field data collection and lab analysis. Are there any significant associations between any pair of lichen species? If so, what is the direction and strength of that association?

         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
         _________________________________________________________________________
    
    
    
    ==========

    Figure 1 Critical Values for Chi-Square (X2crit)

        
                          alpha                
    
     df     0.100     0.050     0.025     0.010     0.005
                                                 
      1     2.706     3.841     5.024     6.635     7.879
      2     4.605     5.991     7.378     9.210    10.597
      3     6.251     7.815     9.348    11.345    12.838
      4     7.779     9.488    11.143    13.277    14.860
      5     9.236    11.070    12.832    15.086    16.750
      6    10.645    12.592    14.449    16.812    18.548
      7    12.017    14.067    16.013    18.475    20.278
      8    13.362    15.507    17.535    20.090    21.955
      9    14.684    16.919    19.023    21.666    23.589
     10    15.987    18.307    20.483    23.209    25.188
     11    17.275    19.675    21.920    24.725    26.757
     12    18.549    21.026    23.337    26.217    28.300
     13    19.812    22.362    24.736    27.688    29.819
     14    21.064    23.685    26.119    29.141    31.319
     15    22.307    24.996    27.488    30.578    32.801
     16    23.542    26.296    28.845    32.000    34.267
     17    24.769    27.587    30.191    33.409    35.718
     18    25.989    28.869    31.526    34.805    37.156
     19    27.204    30.144    32.852    36.191    38.582
     20    28.412    31.410    34.170    37.566    39.997
     21    29.615    32.671    35.479    38.932    41.401
     22    30.813    33.924    36.781    40.289    42.796
     23    32.007    35.172    38.076    41.638    44.181
     24    33.196    36.415    39.364    42.980    45.558
     25    34.382    37.652    40.646    44.314    46.928
     26    35.563    38.885    41.923    45.642    48.290
     27    36.741    40.113    43.195    46.963    49.645
     28    37.916    41.337    44.461    48.278    50.994
     29    39.087    42.557    45.722    49.588    52.335
     30    40.256    43.773    46.979    50.892    53.672
     40    51.805    55.758    59.342    63.691    66.766
     50    63.167    67.505    71.420    76.154    79.490
     60    74.397    79.082    83.298    88.379    91.952
     70    85.527    90.531    95.023   100.425   104.215
     80    96.578   101.879   106.629   112.329   116.321
     90   107.565   113.145   118.136   124.116   128.299
    100   118.498   124.342   129.561   135.807   140.170
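
    For the 1 DF row only, the critical values above can be reproduced from the standard normal distribution, because a Chi-squared variable with 1 degree of freedom is the square of a standard normal variable. The sketch below assumes Python 3.8 or later (for statistics.NormalDist); the function name is made up for illustration, and the approach does not extend to the other rows of the table.

```python
from statistics import NormalDist


def chi2_critical_1df(alpha):
    """Critical Chi-squared for 1 degree of freedom at the given alpha."""
    # Two-tailed normal quantile, squared, gives the upper-tail
    # Chi-squared critical value when DF = 1.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z


for alpha in (0.100, 0.050, 0.025, 0.010, 0.005):
    print(f"alpha = {alpha:.3f}   X2crit = {chi2_critical_1df(alpha):.3f}")
```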
                                                                                    
    
    ==========

    Figure 2: p-Values for X2calc

         X2    1 DF       X2    1 DF       X2    1 DF       X2    1 DF
    
        3.2   .0736      4.4   .0359      5.6   .0180      6.8   .0091
        3.3   .0692      4.5   .0339      5.7   .0170      6.9   .0086
        3.4   .0652      4.6   .0320      5.8   .0160      7.0   .0082
        3.5   .0614      4.7   .0302      5.9   .0151      7.1   .0077
        3.6   .0578      4.8   .0285      6.0   .0143      7.2   .0073
        3.7   .0544      4.9   .0268      6.1   .0135      7.3   .0069
        3.8   .0513      5.0   .0254      6.2   .0128      7.4   .0065
        3.9   .0483      5.1   .0239      6.3   .0121      7.5   .0062
        4.0   .0455      5.2   .0226      6.4   .0114      7.6   .0058
        4.1   .0429      5.3   .0213      6.5   .0108      7.7   .0055
        4.2   .0404      5.4   .0201      6.6   .0102      7.8   .0052
        4.3   .0381      5.5   .0190      6.7   .0096     >7.8  <.0050
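
    If your X2calc falls between two entries in this table, the prob-value for 1 DF can also be computed directly, since for 1 degree of freedom it equals erfc(sqrt(X2/2)), where erfc is the complementary error function. The sketch below is an optional check using only Python's standard math module; the function name is made up for illustration, and the formula applies to 1 DF only.

```python
import math


def p_value_1df(chi2):
    """Upper-tail probability of a Chi-squared value with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Two values that also appear in the table, for comparison.
print(f"{p_value_1df(4.0):.4f}")   # 0.0455
print(f"{p_value_1df(6.6):.4f}")   # 0.0102
```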
    
    
    

    ==========

    first placed on the web: 11/27/01
    last revised: 12/06/03
    © Dr. Christine M. Rodrigue

    ==========