GEOG 442
Biogeography
Area Pattern Analysis for Biogeography
Quadrat-based techniques of area pattern analysis are commonly used in biogeography and ecology. They involve the division of a study area into equal-sized plots, usually through a grid of squares. This permits the use of statistical techniques to analyze quantitative data with no more measurement sophistication than mere frequencies by category (nominal data). The purpose of this lab is to introduce you to Chi-squared analysis, which is a popular approach to discerning relationships among plant species in a quadrat-based analysis.
For your reference pleasure, the definitional formula for Chi-squared is:
$$X^2 = \sum_{i=1}^{r} \sum_{j=1}^{k} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

You'll be comforted to know that I'll walk you through a fairly simple step-by-step approach to doing Chi-square.
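If it helps to see the definitional formula as something executable, here is a minimal Python sketch. It assumes you already have a table of observed frequencies (O) and a matching table of expected frequencies (E). The function name and the small example table are illustrative assumptions, not part of the lab data.

```python
def chi_squared_definitional(observed, expected):
    """Sum (O - E)^2 / E over every cell of an r x k table."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total


# Hypothetical 2 x 2 example (NOT the lab data): every cell is off by 5.
observed = [[30, 20], [20, 30]]
expected = [[25, 25], [25, 25]]
x2 = chi_squared_definitional(observed, expected)
```

The double loop mirrors the double summation: the outer loop runs over the r rows and the inner loop over the k columns.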
Formulating Hypotheses
In statistical evaluation, we set up working and null hypotheses for testing purposes. So, eyeballing the map in Figure 1 below, formulate your hunch about the relationship between the distributions of the two plant species described below.
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________

The problem for scientific reasoning is that such a hunch cannot be tested directly. To create a testable hypothesis, you need to set up a null version of your hunch, or working hypothesis. That is, you need to express the reverse of your expectation. That way, if you reject this testable null hypothesis, the only logical conclusion is that your original hypothesis is the only viable one left. If this mystifies you, please review your statistics course notes or take a stats course. Mystified or not, please state the null version of your hypothesis:

_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________

We will reject the null hypothesis if our results are so extreme that there is no more than a five percent chance that we could have gotten them by pure random luck-of-the-draw in developing our sample in the map below. That is (stats refresher), we will use the 0.05 alpha standard. Another way of looking at it is that, by using such an extreme standard, we can have 95 percent confidence in our conclusion, should we wind up rejecting the null hypothesis and deciding that the association between these plants is not random.
The Data
Figure 1 shows the distribution of two plant species, Salvia apiana (white sage) and Avena barbata (slender oat). We can characterize each of the larger quadrats (the ones labeled A1 or F9 or J5, for example) as belonging to one of the four quadrat types listed below.
- (a) containing both Avena and Salvia;
- (b) containing Avena but no Salvia;
- (c) containing Salvia but no Avena; OR
- (d) containing neither Avena nor Salvia.
All 100 quadrats must be accounted for, each in no more than one category. Because I am such a nice person (and because so many of you may have done the grunt work on this lab in Geography 200 or 140 or a similar kind of lab in Biology 260), I'll present the data already conveniently preclassified for your statistical pleasure. These are your observed or real-world frequencies:

| AVENA \ SALVIA | present | absent | row totals |
|---|---|---|---|
| present | (a) 33 | (b) 15 | (e) |
| absent | (c) 49 | (d) 3 | (f) |
| column totals | (g) | (h) | (i) n = |

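For the curious, the sorting of quadrats into types (a) through (d) can be sketched in a few lines of Python. This is only an illustration: the quadrat labels and presence/absence records below are hypothetical stand-ins (the real ones come from the map in Figure 1), and the function name is my own.

```python
from collections import Counter


def quadrat_type(has_avena, has_salvia):
    """Return the lab's category letter for one quadrat."""
    if has_avena and has_salvia:
        return "a"  # both Avena and Salvia
    if has_avena:
        return "b"  # Avena but no Salvia
    if has_salvia:
        return "c"  # Salvia but no Avena
    return "d"      # neither species


# Hypothetical records: quadrat label -> (Avena present?, Salvia present?)
records = {
    "A1": (True, True),
    "A2": (True, False),
    "A3": (False, True),
    "A4": (False, False),
    "A5": (False, True),
}

counts = Counter(quadrat_type(avena, salvia) for avena, salvia in records.values())

# Every quadrat lands in exactly one category, so the counts must add up.
all_accounted_for = sum(counts.values()) == len(records)
```

Because the four conditions are mutually exclusive and exhaustive, each quadrat is counted once and only once, which is exactly the bookkeeping requirement stated above.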
Doing the Analysis
- Compute the "marginal totals." That is, sum the observed frequencies in each row and put those sums in the appropriate row total (e or f). Do the same for the frequencies in each column and put those sums in the appropriate column total (g or h). The sum of row totals should equal the sum of column totals. If so, put the total number or n (which had better equal 100) in cell i.
- Then, create the "expected frequencies" for each data cell (a through d). This is the distribution of cell counts you would expect from your data if there were no association between the two plant species (i.e., random processes were allocating them among the cells willy-nilly). To do this for each data cell, a through d, multiply the row total to its right by the column total below it and then divide the answer by n. Put the answer, rounded to three decimal places of accuracy, in its cell below the actual observed frequency.
- Still lost? Okay, okay. In other words, multiply cells e and g and divide the answer by cell i. Put the answer, properly rounded, in the lower part of cell a. Similarly, multiply cells e and h and divide by i, and put that answer in cell b. Multiply cells f and g and divide by i, and plop that answer in cell c. Lastly, multiply cell f by cell h, divide by i again, and put the result in cell d.
- That done, examine the expected frequencies to make sure you can properly proceed. Chi-square should not be used if any expected frequencies are below 2 (or, irrelevantly in this case, if more than 20 percent of the data cells have expected frequencies below 5). You will note that there are no such problems with your contingency table, so you can safely proceed through Chi-square.
- Now, move on to the worksheet below for calculating Chi-squared. In the first column, enter the observed frequencies for each data cell (the number in the upper part of cells a through d).
- In the second column, square those frequencies.
- In the third column, divide each squared frequency by the corresponding expected frequency that you worked out in the bottom of the appropriate data cell (a through d).
- Now, sum the third column and put the answer near the bottom of the worksheet (sum(O²/E)). Show your work here to three decimal places of accuracy.
- Finally, subtract n (from cell i) from that sum. This answer is your calculated Chi-squared (X²). Put it at the bottom of the whole worksheet, also rounded to three decimal places of accuracy.
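If you would like to check your hand calculations, the steps above can be sketched in Python roughly as follows. Treat this as a sketch under stated assumptions rather than the official method: the function names are my own, the observed counts are the ones given in the data table, and the 20 percent rule is applied to expected frequencies, which is the usual convention.

```python
def marginal_totals(observed):
    """Return (row totals, column totals, n) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    return row_totals, col_totals, sum(row_totals)


def expected_frequencies(row_totals, col_totals, n):
    """Expected count for each cell: row total * column total / n."""
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared_shortcut(observed, expected, n):
    """Computational shortcut used on the worksheet: sum(O^2 / E) - n."""
    total = sum(
        o ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return total - n


# Rows: Avena present / absent. Columns: Salvia present / absent.
observed = [[33, 15], [49, 3]]

row_totals, col_totals, n = marginal_totals(observed)
expected = expected_frequencies(row_totals, col_totals, n)

# Validity check: no expected frequency below 2, and no more than
# 20 percent of cells with an expected frequency below 5.
cells = [e for row in expected for e in row]
safe_to_proceed = min(cells) >= 2 and sum(e < 5 for e in cells) / len(cells) <= 0.20

x2_calc = round(chi_squared_shortcut(observed, expected, n), 3)
```

The shortcut gives the same number as the definitional formula because the expected frequencies, like the observed ones, add up to n. Do the worksheet by hand first, then run this to see whether your X² agrees.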

| DATA CELL | O | O² | O²/E |
|---|---|---|---|
| (a) | | | |
| (b) | | | |
| (c) | | | |
| (d) | | | |
| | | sum(O²/E) = | |
| | | sum(O²/E) - n = X² = | |

- Now, to interpret this hard-gained number, your X²calc, you need to compare it with a critical X². To do this, you will need the Chi-squared table in Figure 2. You need your pre-selected alpha level to pick the right column and the degrees of freedom for your 2 x 2 contingency table to choose the right row to enter the table. Degrees of freedom in Chi-squared can be defined as:
DF = (r - 1)(k - 1)

where r = number of rows and k = number of columns

So, you will enter the table at the intersection of:

the column headed ________ and the row corresponding to ________ degrees of freedom.

What, then, is your critical Chi-squared value?

X²crit = ________

- Is your X²calc ________ greater than or ________ less than the X²crit?
- If your actual, calculated Chi-square value is greater than the critical Chi-square, you may safely conclude that your pattern is not just a random one. In other words, the association between your variables (in this case, between the two plant species) is statistically significant at your chosen alpha level. If the calculated Chi-square value is less than the critical test value, you cannot rule out that the relationship is merely a random one. Can the null hypothesis of random association between these two plant species in this study area be rejected in this case?
_____ reject Ho _____ do not reject Ho

- It's always good etiquette, whenever possible, to report the prob-value, that is, the probability of committing a Type I error should you reject the null hypothesis. It expresses how much faith can be placed in the null hypothesis, and it is useful on the off chance that a reader may have compelling reasons to use a different standard of alpha than you chose. I have provided you the needed data in Figure 3 to tell the probability that you could have gotten results as extreme as yours if there is but a random association between the two plant species.
________ prob-value of Ho

- Plot complication. Chi-squared is notoriously sensitive to sample size. That is, the same percentages in each cell can appear significant in a big sample (large n) or insignificant in a small sample. It might help to assess the strength of a significant relationship, should the Chi-squared test find one. For that, you can use Yule's Q. Yule's Q, however, can only be calculated for contingency tables with no more than two rows and two columns (bigger tables can sometimes be collapsed into a 2 x 2 format, by combining rows and columns in some sort of logical way). Conveniently, this lab just happens to feature a 2 x 2 table.
To calculate Yule's Q, multiply cells a and d and also cells b and c. Then, enter these two products into the following formula:

$$Q = \frac{ad - bc}{ad + bc}$$

So, what is the Q value for this lab? ________
- Now, what does it all MEAN? Basically, Yule's Q can vary from -1 to +1. The closer it is to 0, the weaker the relationship is. The closer it is to -1 or +1, the stronger the relationship is, whether inverse (negative) or direct (positive).
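The Yule's Q arithmetic is short enough to sketch in Python, should you want to double-check your answer. The function name is my own invention, and the four arguments are the observed counts from cells a through d of the data table.

```python
def yules_q(a, b, c, d):
    """Yule's Q for a 2 x 2 table: (ad - bc) / (ad + bc)."""
    ad = a * d
    bc = b * c
    if ad + bc == 0:
        # Happens only when both products are zero; Q is undefined then.
        raise ValueError("Yule's Q is undefined when ad + bc is zero")
    return (ad - bc) / (ad + bc)


# Observed counts from the lab's data table, cells (a), (b), (c), (d).
q = yules_q(33, 15, 49, 3)

# The sign gives the direction: negative is inverse, positive is direct.
direction = "inverse" if q < 0 else "direct" if q > 0 else "none"
```

Note that Q looks only at the two diagonal products, so it does not change if you multiply every cell by the same factor. That is what makes it a useful companion to Chi-squared, which does respond to sample size.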
Please interpret the results of Lab B, taking into consideration both Chi-squared and Yule's Q. What sort of ecological relationship, if any, exists between Salvia apiana and Avena barbata at this scale of analysis?
_________________________________________________________________________ _________________________________________________________________________ _________________________________________________________________________ _________________________________________________________________________ _________________________________________________________________________ _________________________________________________________________________![]()
Figure 1 Map of Oats and Sage
Figure 2: Critical Values for Chi-Square (X²crit)

| df | alpha = 0.100 | alpha = 0.050 | alpha = 0.025 | alpha = 0.010 | alpha = 0.005 |
|---|---|---|---|---|---|
| 1 | 2.706 | 3.841 | 5.024 | 6.635 | 7.879 |
| 2 | 4.605 | 5.991 | 7.378 | 9.210 | 10.597 |
| 3 | 6.251 | 7.815 | 9.348 | 11.345 | 12.838 |
| 4 | 7.779 | 9.488 | 11.143 | 13.277 | 14.860 |
| 5 | 9.236 | 11.070 | 12.832 | 15.086 | 16.750 |
| 6 | 10.645 | 12.592 | 14.449 | 16.812 | 18.548 |
| 7 | 12.017 | 14.067 | 16.013 | 18.475 | 20.278 |
| 8 | 13.362 | 15.507 | 17.535 | 20.090 | 21.955 |
| 9 | 14.684 | 16.919 | 19.023 | 21.666 | 23.589 |
| 10 | 15.987 | 18.307 | 20.483 | 23.209 | 25.188 |
| 11 | 17.275 | 19.675 | 21.920 | 24.725 | 26.757 |
| 12 | 18.549 | 21.026 | 23.337 | 26.217 | 28.300 |
| 13 | 19.812 | 22.362 | 24.736 | 27.688 | 29.819 |
| 14 | 21.064 | 23.685 | 26.119 | 29.141 | 31.319 |
| 15 | 22.307 | 24.996 | 27.488 | 30.578 | 32.801 |
| 16 | 23.542 | 26.296 | 28.845 | 32.000 | 34.267 |
| 17 | 24.769 | 27.587 | 30.191 | 33.409 | 35.718 |
| 18 | 25.989 | 28.869 | 31.526 | 34.805 | 37.156 |
| 19 | 27.204 | 30.144 | 32.852 | 36.191 | 38.582 |
| 20 | 28.412 | 31.410 | 34.170 | 37.566 | 39.997 |
| 21 | 29.615 | 32.671 | 35.479 | 38.932 | 41.401 |
| 22 | 30.813 | 33.924 | 36.781 | 40.289 | 42.796 |
| 23 | 32.007 | 35.172 | 38.076 | 41.638 | 44.181 |
| 24 | 33.196 | 36.415 | 39.364 | 42.980 | 45.558 |
| 25 | 34.382 | 37.652 | 40.646 | 44.314 | 46.928 |
| 26 | 35.563 | 38.885 | 41.923 | 45.642 | 48.290 |
| 27 | 36.741 | 40.113 | 43.195 | 46.963 | 49.645 |
| 28 | 37.916 | 41.337 | 44.461 | 48.278 | 50.994 |
| 29 | 39.087 | 42.557 | 45.722 | 49.588 | 52.335 |
| 30 | 40.256 | 43.773 | 46.979 | 50.892 | 53.672 |
| 40 | 51.805 | 55.758 | 59.342 | 63.691 | 66.766 |
| 50 | 63.167 | 67.505 | 71.420 | 76.154 | 79.490 |
| 60 | 74.397 | 79.082 | 83.298 | 88.379 | 91.952 |
| 70 | 85.527 | 90.531 | 95.023 | 100.425 | 104.215 |
| 80 | 96.578 | 101.879 | 106.629 | 112.329 | 116.321 |
| 90 | 107.565 | 113.145 | 118.136 | 124.116 | 128.299 |
| 100 | 118.498 | 124.342 | 129.561 | 135.807 | 140.170 |

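Entering the critical-value table is really just a two-key lookup, which can be sketched in Python like this. The dictionary holds only the first four rows of Figure 2, copied as printed; the names and the 3 x 3 example with its made-up X²calc values are hypothetical, chosen so as not to give away the lab's own answer.

```python
# First four rows of Figure 2: df -> {alpha: critical Chi-squared value}
CRITICAL_X2 = {
    1: {0.100: 2.706, 0.050: 3.841, 0.025: 5.024, 0.010: 6.635, 0.005: 7.879},
    2: {0.100: 4.605, 0.050: 5.991, 0.025: 7.378, 0.010: 9.210, 0.005: 10.597},
    3: {0.100: 6.251, 0.050: 7.815, 0.025: 9.348, 0.010: 11.345, 0.005: 12.838},
    4: {0.100: 7.779, 0.050: 9.488, 0.025: 11.143, 0.010: 13.277, 0.005: 14.860},
}


def degrees_of_freedom(rows, cols):
    """DF = (r - 1)(k - 1) for a contingency table with r rows and k columns."""
    return (rows - 1) * (cols - 1)


def critical_value(alpha, df):
    """Look up X2crit: df picks the row, alpha picks the column."""
    return CRITICAL_X2[df][alpha]


def reject_null(x2_calc, alpha, rows, cols):
    """Reject Ho only when the calculated value exceeds the critical one."""
    return x2_calc > critical_value(alpha, degrees_of_freedom(rows, cols))


# Hypothetical 3 x 3 table tested at alpha = 0.05 (NOT this lab's table).
df_example = degrees_of_freedom(3, 3)
crit_example = critical_value(0.050, df_example)
decision_high = reject_null(12.5, 0.050, 3, 3)  # made-up X2calc above the critical value
decision_low = reject_null(6.0, 0.050, 3, 3)    # made-up X2calc below the critical value
```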
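Figure 3: p-Values for X²calc

| X² | p (1 DF) | X² | p (1 DF) | X² | p (1 DF) | X² | p (1 DF) |
|---|---|---|---|---|---|---|---|
| 3.2 | .0736 | 4.4 | .0359 | 5.6 | .0180 | 6.8 | .0091 |
| 3.3 | .0692 | 4.5 | .0339 | 5.7 | .0170 | 6.9 | .0086 |
| 3.4 | .0652 | 4.6 | .0320 | 5.8 | .0160 | 7.0 | .0082 |
| 3.5 | .0614 | 4.7 | .0302 | 5.9 | .0151 | 7.1 | .0077 |
| 3.6 | .0578 | 4.8 | .0285 | 6.0 | .0143 | 7.2 | .0073 |
| 3.7 | .0544 | 4.9 | .0268 | 6.1 | .0135 | 7.3 | .0069 |
| 3.8 | .0513 | 5.0 | .0254 | 6.2 | .0128 | 7.4 | .0065 |
| 3.9 | .0483 | 5.1 | .0239 | 6.3 | .0121 | 7.5 | .0062 |
| 4.0 | .0455 | 5.2 | .0226 | 6.4 | .0114 | 7.6 | .0058 |
| 4.1 | .0429 | 5.3 | .0213 | 6.5 | .0108 | 7.7 | .0055 |
| 4.2 | .0404 | 5.4 | .0201 | 6.6 | .0102 | 7.8 | .0052 |
| 4.3 | .0381 | 5.5 | .0190 | 6.7 | .0096 | >7.8 | <.0050 |
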
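If your X²calc falls between two table entries, or off the end of the table, the prob-value for 1 degree of freedom can be computed directly. The sketch below leans on a standard identity rather than anything specific to this lab: a Chi-squared variable with 1 DF is the square of a standard normal variable, so its upper-tail probability is erfc(sqrt(X²/2)). The function name is an assumption of mine, and the formula is valid for 1 DF only.

```python
import math


def p_value_1df(x2):
    """Upper-tail probability of a Chi-squared statistic with 1 degree of freedom."""
    if x2 < 0:
        raise ValueError("Chi-squared cannot be negative")
    return math.erfc(math.sqrt(x2 / 2.0))


# Spot check against a table entry: Figure 3 lists .0455 for X2 = 4.0.
p_at_4 = p_value_1df(4.0)
matches_table = abs(p_at_4 - 0.0455) < 0.0001
```

The same function works for values beyond the table's last row, where the table can only tell you that the prob-value is below .0050.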
first placed on the web: 11/26/98
last revised: 03/06/07
© Dr. Christine M. Rodrigue