I have a data set that I'd like to interpolate using Kriging. I know that ArcGIS can generate a variance raster alongside the interpolation result that shows the likely variance from the interpolation.

However, the z values in my input data have uncertainties of their own. Is there a way to have these uncertainties reflected in the output of the Kriging process? In other words, can the output variance raster express the variance that takes into account the uncertainty of the input data?

Cheers,

Steve.

I have this exact same issue. How do you go about doing this in Geostatistical Analyst? (Yes I do have the extension)

Hi Crystal,

When there are uncertainties in the input data, we call that measurement error. Measurement error can be handled in several different ways, depending on your data.

Does each of your measured values have the same measurement error (i.e., is the uncertainty the same for every value), or does the measurement error change from point to point?

My measurement error changes from point to point.

Thanks for your help!

To incorporate heterogeneous measurement error (error that varies from point to point), you will need to use geostatistical simulations.

Here is the outline of the workflow:

What you should expect to see after running the tool is that the kriging predictions (Mean raster) will be very close to the kriging predictions without specifying measurement error. The standard errors (Standard Deviation raster) will be larger than they were without specifying measurement error. They will be larger because the uncertainty in the input is propagated correctly to the uncertainty in the output.
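The idea behind the simulation workflow can be sketched outside ArcGIS. Here's a minimal numpy illustration (not the Geostatistical Analyst implementation; all locations, values, and error standard deviations are invented, and a simple inverse-distance interpolator stands in for kriging): each realization perturbs the inputs by their own measurement-error standard deviation before interpolating, and the spread across realizations is the propagated uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observations: locations, values, and per-point
# measurement-error standard deviations (all made up for illustration).
xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
z = np.array([10.0, 12.0, 11.0, 14.0])
err_sd = np.array([0.5, 0.2, 1.0, 0.3])  # varies point to point

def idw(xy_obs, z_obs, xy_new, power=2.0):
    """Inverse-distance interpolation; a stand-in for kriging here."""
    d = np.linalg.norm(xy_obs - xy_new, axis=1)
    w = 1.0 / np.maximum(d, 1e-12) ** power
    return np.sum(w * z_obs) / np.sum(w)

target = np.array([0.5, 0.5])
n_real = 1000

# Each realization draws a new set of plausible "true" values consistent
# with the measurement-error model, then interpolates.
sims = np.array([
    idw(xy, z + rng.normal(0.0, err_sd), target)
    for _ in range(n_real)
])

print("mean prediction:", sims.mean())  # close to the no-error prediction
print("propagated sd:  ", sims.std())   # larger than with err_sd = 0
```

Averaging the realizations recovers roughly the same prediction you would get without measurement error, while the standard deviation across realizations now carries the input uncertainty, which is exactly the behavior described above for the Mean and Standard Deviation rasters.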

Let me know if you have any other questions or need any clarifications.

I am a little unsure about #7 and the meaning of "one standard deviation of measurement error". Is there a formula I could use to calculate this value from mean, standard deviation, standard error, variance, etc.?

In #8, size seems to be predetermined when I started adding variables. Should I just leave the size that it computes alone, or is there some way I should calculate that separately?

Thanks again for your help!

The cell size will default to 1/250 of the width or height of the raster. You can freely change this value or just use the default. It only controls the resolution of the output raster, so if you want finer pixels, make the value smaller. It's really up to you and has no impact on the geostatistics.

About the measurement error, a little explanation is needed. The measurement error model that we use assumes that each location has some true, underlying value. However, when this true value is measured, there will be measurement error such that the measured value will not be identical to the true value.

The model assumes that the measured value at a location is equal to the true value at the location, plus some random noise. This noise is assumed to follow a normal (Gaussian) distribution, where the mean of the normal distribution is equal to the true, underlying value. What you need to provide to the tool is the standard deviation of this normal distribution. The larger the standard deviation, the more noisy the measurement.

Unfortunately, there is no way to really calculate this standard deviation for each location. You have to just know it somehow. This information sometimes can be found in the manufacturing documentation of whatever device you used to take the measurements, and sometimes it is known from past research or for physical reasons.
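To make the model in the preceding paragraphs concrete, here's a small sketch (all values invented): a measured value is the true value plus Gaussian noise whose standard deviation is what the tool asks for. It also shows why a single field value can't reveal that standard deviation: only repeated measurements would, which is why the number must come from instrument specs or prior knowledge.

```python
import numpy as np

rng = np.random.default_rng(1)

true_value = 20.0  # hypothetical true value at one location
error_sd = 0.8     # the standard deviation the tool asks for

# One measurement under the model: true value plus Gaussian noise.
measured = true_value + rng.normal(0.0, error_sd)

# With many repeated measurements, the noise averages out and its
# spread estimates error_sd. A single measured value, by contrast,
# gives no way to separate signal from noise.
repeats = true_value + rng.normal(0.0, error_sd, size=10_000)
print(repeats.mean())  # near 20.0
print(repeats.std())   # near 0.8
```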

This actually works really well for me. My data points are trend values calculated from daily site data over 20 years. When I calculate the trend, I also calculate the standard error of the trend estimate, which is exactly what you're asking for. (Let me know if I'm getting that wrong, though.) All data at each site are taken by the same instruments and already corrected for, so I'm not worried about the instrument-to-instrument error, just the error in my trend estimates.

One last thing: my simulation (using 1000 for Number of Realizations) has been running for about two hours now. Is that usual, or should I lower that number?

Thanks for all the detail you've provided. It's been invaluable.

Yes, that sounds like an ideal setup. In fact, that is exactly why measurement errors are passed as standard deviations. The most common source of measurement error is when the measured values aren't "measured" at all; instead, they are outputs or aggregations of some other model (like a long-term trend, in your case). These other models often calculate standard errors or standard deviations, and these can be propagated directly as measurement errors.
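As a sketch of that setup (with synthetic stand-in data, not your sites): fitting a linear trend to a site's yearly values yields both the trend estimate and the standard error of that estimate, and the latter is the per-site value you'd supply as the measurement-error standard deviation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Synthetic stand-in for 20 years of yearly values at one site:
# a true trend of 0.05 per year plus year-to-year noise.
years = np.arange(2000, 2020)
values = 0.05 * (years - 2000) + rng.normal(0.0, 0.3, size=years.size)

fit = stats.linregress(years, values)

print("trend estimate:", fit.slope)
print("standard error of trend:", fit.stderr)  # usable as the per-site
                                               # measurement-error sd
```

Repeating this per site gives one trend value and one standard error per site, which maps directly onto the heterogeneous measurement-error input described earlier.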

As for the time of calculation, I assume the process has either completed or you canceled it already. The biggest contributors to the computation time are the number of points, the output cell size, and the number of simulations. You can't really change the number of points, so you can either reduce the number of realizations or increase the cell size to speed up the process. Increasing the cell size will make the output more pixelated, and reducing the number of simulations will reduce precision in the calculated predictions and standard deviations. I can't really give more specific recommendations than that without looking at your data.
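On the realizations-versus-precision tradeoff: Monte Carlo precision improves only with the square root of the number of realizations, so cutting the count buys a lot of speed at a modest cost in precision. A small numpy check (pure illustration, nothing ArcGIS-specific) of how noisy a standard-deviation estimate is at different realization counts:

```python
import numpy as np

rng = np.random.default_rng(3)

# How noisy is a standard deviation estimated from n realizations?
# Repeat the estimation 200 times per n and look at the spread:
# it shrinks roughly as 1 / sqrt(n).
spread = {}
for n in (50, 200, 800):
    estimates = rng.normal(0.0, 1.0, size=(200, n)).std(axis=1)
    spread[n] = estimates.std()
    print(n, round(spread[n], 3))
```

Quadrupling the realizations only halves the noise in the estimated standard deviation, which is why modest counts are often good enough in practice.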

Perfect!

The process ended in an error, so I lowered the number of simulations to 50 and it worked fine. I'll slowly bump up the number of realizations to figure out what my clunky, old work computer can handle. I think it's my computer, not the number of points I have (I only have 160 sites across the contiguous U.S.).

Thanks again for your help. I could not have done this without your thoughtful and detailed responses!