I have this exact same issue. How do you go about doing this in Geostatistical Analyst? (Yes I do have the extension)
Hi Crystal,
When there are uncertainties in the input data, we call that measurement error. Measurement error can be handled in several different ways, depending on your data.
Does each of your measured values have the same measurement error (i.e., is the uncertainty the same for every value), or does the measurement error change from point to point?
My measurement error changes from point to point.
Thanks for your help!
To incorporate heterogeneous measurement error (error that varies from point to point), you will need to use geostatistical simulations.
Here is the outline of the workflow:
What you should expect to see after running the tool is that the kriging predictions (the Mean raster) will be very close to the kriging predictions without specifying measurement error, while the standard errors (the Standard Deviation raster) will be larger than they were without specifying measurement error. They are larger because the uncertainty in the input is now correctly propagated into the uncertainty of the output.
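To see why the predictions barely move while the standard errors grow, here is a toy Monte Carlo sketch (this is not Geostatistical Analyst itself): a fixed weighted average stands in for the kriging predictor, and every value, weight, and per-point error below is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

true_values = np.array([10.0, 12.0, 11.0, 9.5])  # hypothetical "true" values
weights = np.array([0.4, 0.3, 0.2, 0.1])         # stand-in "kriging" weights
sigmas = np.array([0.5, 1.0, 0.8, 0.3])          # per-point measurement error (std dev)

# Prediction with no measurement error at all.
noiseless_prediction = weights @ true_values

# Each realization perturbs every measurement by its own error, then predicts.
n_realizations = 10_000
noise = rng.normal(0.0, sigmas, size=(n_realizations, sigmas.size))
predictions = (true_values + noise) @ weights

print(round(noiseless_prediction, 3))
print(round(predictions.mean(), 3))  # very close to the noiseless prediction
print(round(predictions.std(), 3))   # extra spread contributed by measurement error
```

The mean over realizations stays near the noiseless prediction, but the spread of the realizations is now nonzero: that spread is the part of the Standard Deviation raster that comes from the measurement error you supplied.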
Let me know if you have any other questions or need any clarifications.
I am a little unsure about #7 and the meaning of "one standard deviation of measurement error". Is there a formula I could use to calculate this value from the mean, standard deviation, standard error, variance, etc.?
In #8, the size seems to be predetermined when I start adding variables. Should I just leave the size it computes alone, or is there some way I should calculate that separately?
Thanks again for your help!
The cell size will default to 1/250 of the width or height of the raster. You can freely change this value or just use the default. This just changes the resolution of the raster, so if you want high resolution in the pixels, you can make the value smaller. It's really up to you and doesn't have any impact on the geostatistics.
About the measurement error, a little explanation is needed. The measurement error model that we use assumes that each location has some true, underlying value. However, when this true value is measured, there will be measurement error such that the measured value will not be identical to the true value.
The model assumes that the measured value at a location is equal to the true value at the location, plus some random noise. This noise is assumed to follow a normal (Gaussian) distribution, where the mean of the normal distribution is equal to the true, underlying value. What you need to provide to the tool is the standard deviation of this normal distribution. The larger the standard deviation, the more noisy the measurement.
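A minimal sketch of that model in code, with a made-up true value and made-up sigmas: a measurement is the true value plus Normal(0, sigma) noise, and the sigma you hand to the tool is exactly the standard deviation of that noise.

```python
import numpy as np

rng = np.random.default_rng(0)

true_value = 20.0                      # the underlying value at one location
small_sigma, large_sigma = 0.2, 2.0    # "one standard deviation of measurement error"

# Repeated measurements under each error level: measured = true + Normal(0, sigma).
precise = true_value + rng.normal(0.0, small_sigma, size=5000)
noisy = true_value + rng.normal(0.0, large_sigma, size=5000)

# Both measurement sets are centered on the true value...
print(round(precise.mean(), 2), round(noisy.mean(), 2))
# ...but the larger sigma produces much more scatter around it.
print(round(precise.std(), 2), round(noisy.std(), 2))
```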
Unfortunately, there is no way to really calculate this standard deviation for each location. You have to just know it somehow. This information sometimes can be found in the manufacturing documentation of whatever device you used to take the measurements, and sometimes it is known from past research or for physical reasons.
This actually works really well for me. My data points are trend values calculated from daily site data over 20 years. When I calculate the trend, I also calculate the standard error of the trend estimate, which is exactly what you're asking for. (Let me know if I'm getting that wrong, though.) All data at each site are taken by the same instruments and already corrected, so I'm not worried about instrument-to-instrument error, just the error in my trend estimates.
One last thing: my simulation (using 1000 for Number of Realizations) has been running for about two hours now. Is that usual, or should I lower that number?
Thanks for all the detail you've provided. It's been invaluable.
Yes, that sounds like an ideal setup. In fact, that is exactly why measurement errors are passed as standard deviations. The most common source of measurement error is when the measured values aren't "measured" at all; instead, they are outputs or aggregations of some other model (like a long-term trend, in your case). These other models often calculate standard errors or standard deviations, and these can be propagated directly as measurement errors.
As for the computation time, I assume the process has either completed or you canceled it already. The biggest contributors to the computation time are the number of points, the output cell size, and the number of realizations. You can't really change the number of points, so you can either reduce the number of realizations or increase the cell size to speed up the process. Increasing the cell size will make the output more pixelated, and reducing the number of realizations will reduce the precision of the calculated predictions and standard deviations. I can't give more specific recommendations than that without looking at your data.
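As a rough rule of thumb for that trade-off (a generic Monte Carlo sketch, not a measurement of the actual tool): the statistical error of a simulated mean shrinks like 1/sqrt(N), so quadrupling the number of realizations roughly quadruples the runtime but only halves the Monte Carlo error. The numbers below are made up to illustrate the scaling.

```python
import numpy as np

rng = np.random.default_rng(7)

def mc_error(n_realizations, n_trials=2000):
    # Empirical std dev of a mean estimated from n_realizations unit-normal draws.
    draws = rng.normal(0.0, 1.0, size=(n_trials, n_realizations))
    return draws.mean(axis=1).std()

err_250 = mc_error(250)    # error with 250 realizations
err_1000 = mc_error(1000)  # 4x the work, roughly half the error
print(round(err_250, 4), round(err_1000, 4))
```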