Figuring out how many samples you need
The number of samples you need is affected by the following factors:
Here are some graphs that illustrate some of these trade-offs. These graphs were made using the assumption that you would be analyzing your data using simple linear regression. Each graph isolates one factor and looks at how altering that factor affects sample size. Those factors are explained in greater detail below.
In general, you can lower your sample size requirements by adopting the following approaches:
More details on the costs and tradeoffs associated with these decisions follow.
Note: Any calculation of how many samples you need for your monitoring program should be treated simply as an educated guess. Too many of the variables involved are out of your control. Your real statistical power and your truly optimal number of samples can only be calculated once you have collected your data for several years. So, checking how well you are doing afterwards is just as important as estimating how many samples you need before you start. I would recommend re-evaluating how well you are doing after 5 years.
Sample sizes will be defined by what statistical test you use to analyze the data. Different statistical tests will require different sample sizes, all else being equal. Consequently, to calculate your sample size or to estimate the statistical power of your sampling regime you must use equations or simulations specific to the statistical analysis technique. Links to such calculations/calculators are given below. If you plan to use a complicated analytical approach you will likely not find existing software to estimate sample sizes. In this case your solution would be to ask a statistician to model it for you or to be satisfied with using a calculator built for a simpler analysis.
We know of no web sites that provide guidance on which analytical techniques to use when monitoring. If you know of any let us know and we will link to them.
In order to detect a change in the number of animals you have on your property you need a way of counting them. You need to choose a counting technique (e.g., number of wood frog egg masses in a pool, number of birds detected at a point, number of salamanders under rocks) and repeat it over the years. To be useful for detecting trends these counts must have the following characteristics:
Issues of bias, sample placement, and choosing your counting technique have been discussed elsewhere in this web site. Here we will help you determine whether your monitoring program has a sufficient number of sampling locations (sample size) to detect the types of changes you have set forth as your goal.
So, what is a sufficient sample size?
To answer that you need to address three things:
Count variation is simply a measure of how your counts fluctuate from year to year. Variation affects your ability to detect trends because, if the data fluctuate greatly, you will not have the resolution to find an increasing or decreasing trajectory in the population you are monitoring.
Basic rule of thumb: The more variable your counts, the more samples you will need to detect a change or trend of a given magnitude. Conversely, for any given sample size, the more your counts vary the lower your ability to detect trends.
Sample size calculations require an estimate of count variation. You can get such an estimate from your own pilot data (the mean and standard deviation) or from estimates taken from other, similar situations. We provide some of those estimates for amphibians and birds (point counts and territory mapping). Note that these are calculations of fluctuation over time, not over space, meaning that you calculate a mean and standard deviation of the counts across several years at one point and not a mean and standard deviation among several points.
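As a minimal sketch of the "over time, not over space" calculation, the Python below computes a mean, standard deviation, and coefficient of variation (CV) separately for each point from its counts across years. The point names and counts are invented for illustration, and `temporal_cv` is a hypothetical helper name, not part of any tool mentioned on this site.

```python
from statistics import mean, stdev


def temporal_cv(yearly_counts):
    """Return (mean, sd, cv) for counts made at ONE point across several years.

    The sd is the sample standard deviation (n - 1 in the denominator).
    """
    if len(yearly_counts) < 2:
        raise ValueError("need counts from at least two years")
    m = mean(yearly_counts)
    if m == 0:
        raise ValueError("CV is undefined when the mean count is zero")
    s = stdev(yearly_counts)
    return m, s, s / m


# Hypothetical pilot data: one list of yearly counts per sampling point.
counts_by_point = {
    "pool_A": [12, 18, 9, 15, 11],
    "pool_B": [40, 22, 35, 51, 30],
}

# Summarize each point across years; do NOT pool the points within a year.
for point, counts in counts_by_point.items():
    m, s, cv = temporal_cv(counts)
    print(f"{point}: mean={m:.1f} sd={s:.1f} CV={cv:.0%}")
```

Each point gets its own summary, so two points with very different average counts can still be compared through their CVs.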
Be aware, when using counts from other studies, that count variances are specific to the counting technique and to how the original study pooled its samples. Additionally, as you can see from these collections of count variances, even when the same counting technique is used on the same species in the same region, the variability in the resulting counts usually differs (often greatly) from study to study or even from site to site. The good news is that reviews of long-term studies have shown that, at any individual site, the variability in counts remains about the same. This is another strong reason to review your monitoring program after 5 years, to see whether you have been adequately sampling your populations.
Basic rule of thumb: Use the estimates of variability of counts from other studies as a general guide to what you might expect in yours, but check the variability of your own counts after your program has been established for 5 years to see if your sampling strategy needs to be revised.
Trend. Trend can be defined as change over time. More apropos to a monitoring program is to define trend as some specific rate of change over a specific period of time. Most calculations for determining sample size require that you specify a minimum rate of change you would like to detect and a minimum time period over which you would detect that change. Those minimums become the targets your monitoring program will aim to achieve or beat. In other words, by setting your sample sizes appropriately you hope to be able to detect a trend as small as, or smaller than, the minimums you have targeted.
Basic rule of thumb: The smaller the population change you would like to detect, the greater the number of samples you will need to detect it.
Another basic rule of thumb: The fewer the number of years over which you would like to detect a trend, the greater the number of samples you will need.
Grand rule of thumb: Any monitoring program whose goal it is to detect small population changes over just a few years will be expensive to create.
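Because a trend target pairs a rate of change with a time period, it can help to convert between an annual rate and the total change it implies. The sketch below assumes the rate compounds at a constant proportion each year, which is an assumption of this example and not something a calculator is guaranteed to share; the function names are hypothetical.

```python
def total_change(annual_rate, years):
    """Cumulative proportional change after compounding annual_rate for `years` years.

    annual_rate = -0.03 means a 3% decline per year (assumed constant).
    """
    return (1.0 + annual_rate) ** years - 1.0


def annual_rate_for(total, years):
    """Constant annual rate that compounds to `total` proportional change over `years` years."""
    return (1.0 + total) ** (1.0 / years) - 1.0


# Example targets (illustrative values only).
print(f"3% annual decline over 10 years: {total_change(-0.03, 10):.1%} total")
print(f"50% decline over 25 years: {annual_rate_for(-0.50, 25):.2%} per year")
```

A 3% annual decline sustained for 10 years amounts to a total decline of roughly 26%, which shows why the rate and the period must be stated together.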
Precision. Calculators of sample size also need to know how precisely you want to measure these changes. An imprecisely measured trend is a very unsatisfactory trend in that you are unsure of how well it really reflects the REAL changes in the animal populations on your land. On the other hand, a very precisely measured trend can be very costly to obtain because you will have to spend a great deal of your budget to achieve that level of precision. So, think of your precision goal as your willingness to risk being wrong about the population change you are trying to measure. You need to determine the amount of risk you are willing to take in your monitoring program and understand the consequences of that decision, both as a cost to your budget and in the probability of being wrong.
Basic rule of thumb: The lower the precision - the lower the number of samples you will need. Conversely, the higher the precision the larger the number of samples needed.
You control precision and risk using two statistical settings: alpha and power. Because most basic statistical books and quite a number of web sites cover these parameters well and are very accessible, we will not cover them here, but we do want to highlight a few considerations relevant to the estimation of sample sizes for monitoring programs.
You now know that count variability affects the number of samples you need, as do your requirements for the magnitude of change you want your program to detect. The last issue that needs to be resolved is what statistical model you will use to test your data.
Basic rule of thumb. The specific formula (or simulations) used to calculate sample size is unique to the statistical test or model that you will use.
Now ... some practical guidance on how to calculate the sample sizes for your monitoring program.
Note: Throughout this document we often use the terms variance, variation, and variability as shorthand for the variability of counts. However, understand that the actual mathematical calculation of variability could be any one of several measures (standard deviation, standard error, variance, or coefficient of variation), each of which has a specific statistical meaning.
Basic rules of thumb. You must have determined the following to set sample sizes:
Calculating the mean and standard deviation requires some additional explanation. While the other factors that affect sample sizes are set based on your desired need for precision and the smallest degree of change you want to detect, the mean and variance are factors that are set by the animals being sampled. If you have several years of pilot data you will want to calculate the mean and standard deviation from your own data. If you don’t, then you can use someone else’s data from as similar a situation as you can find to estimate means and standard deviations. In a pinch, you can estimate some of the variation to be expected in a set of yearly counts by calculating a mean and standard deviation from one year’s data if you have several replicates OF THE SAME points or plots. However this approach fails to account for any between-year variation in the animal populations. Finally, you can use data published in the literature or from one of our databases on count variation (e.g. amphibians, bird point counts, bird territory mapping).
Pilot data are by far the best source of information for determining count variation. We have found that the variation in the counts of animals is very consistent within a site (figure). However, there are often wide differences in the variation of counts among sites, even among nearby sites surveyed with the same technique.
Ways to avoid problems when you calculate variances:
Basic rule of thumb: If you have no access to pilot data and are not aware of examples from the literature that you trust, a conservative estimate of variation to use in sample size calculations is a CV of 100%, with a moderately conservative alternative of 75%.
Choosing an analytical technique also requires some further explanation. The specific calculation of sample sizes is different for every statistical test. In complicated analyses formulas often don’t exist and simulation must be used to calculate sample size. Listed below are some text and web resources for setting sample sizes for various simple statistical models. For complicated situations you can run the simulations yourself, have a statistician do them for you, or use a conservative model to estimate sample sizes.
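To make the simulation route concrete, here is a minimal Python sketch of estimating power for simple linear regression. It is an illustration under stated assumptions, not the method used by any calculator referenced on this page: counts are drawn from a lognormal distribution with a fixed CV, plots and years are independent, the yearly index is the mean count across plots, and a trend counts as detected when the slope's t statistic exceeds a critical value you supply (2.306 is the two-sided 5% value of Student's t with 8 degrees of freedom, which matches 10 years of data). All function and parameter names are hypothetical.

```python
import math
import random


def slope_t_statistic(xs, ys):
    """t statistic for the slope of a least-squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    if sse == 0:  # perfect fit: no residual noise to scale the slope by
        return 0.0 if slope == 0 else math.copysign(math.inf, slope)
    return slope / math.sqrt(sse / (n - 2) / sxx)


def simulate_power(n_plots, n_years, annual_change, cv, t_crit,
                   mean_count=20.0, n_sims=1000, seed=1):
    """Fraction of simulated programs in which the trend is detected.

    ASSUMPTIONS of this sketch: lognormal counts with constant CV,
    independent plots and years, regression on the yearly mean count.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(math.log(1.0 + cv ** 2))  # lognormal shape for this CV
    years = list(range(n_years))
    detected = 0
    for _ in range(n_sims):
        index = []
        for t in years:
            expected = mean_count * (1.0 + annual_change) ** t
            mu = math.log(expected) - sigma ** 2 / 2.0  # keeps the mean at `expected`
            counts = [rng.lognormvariate(mu, sigma) for _ in range(n_plots)]
            index.append(sum(counts) / n_plots)
        if abs(slope_t_statistic(years, index)) > t_crit:
            detected += 1
    return detected / n_sims


if __name__ == "__main__":
    # Illustrative scan: 3% annual decline, 10 years, CV of 100%.
    for n_plots in (15, 30, 60):
        power = simulate_power(n_plots, 10, -0.03, 1.0, t_crit=2.306, n_sims=500)
        print(f"{n_plots} plots: estimated power {power:.2f}")
```

Scanning a few candidate numbers of plots this way shows roughly where power reaches your target. A real program will usually have features this sketch ignores, such as year effects shared across plots, so treat its output as a rough guide only.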
James Gibbs has created a software program that estimates sample sizes for those who will use linear or exponential regression to analyze their data. As this type of regression is the most basic it is also likely to be the most conservative. Currently his software only runs on Windows XP or 2000. It is available by contacting Sam Droege at the address below. A new version is expected out soon that will run on more platforms.
Desperation rule of thumb: If, for whatever reason, you cannot calculate a reasonable estimate for the number of samples to take, then put in 60 plots/points; under many circumstances that may be sufficient. Obviously, the more the better. Be sure to review your data after 3 years to re-evaluate this weak choice.
U.S. Department of the Interior
U.S. Geological Survey
Patuxent Wildlife Research Center
Laurel, MD USA 20708-4038
Contact Sam Droege, email firstname.lastname@example.org