"Standard Error" GA Layer to Points != "Prediction Standard Error Map"

07-08-2014 12:58 PM
DaveWetta
New Contributor

I thought I had my head around cross-validation in Geostatistical Analyst, but I find the following apparent discrepancy confusing/troubling.

=====

I have split my input data into training and test data sets using "Subset Features".

I have built a simple kriging model using the training data set.

I run GA Layer to Points on the kriging model layer built from the training data set, and select the test data set as Input Point Observation Locations. This creates a "Standard Error" field (among others). My understanding is that this is the square root of the kriging variance at those points.

I right-click on my training data set kriging model layer and select "Change Output to Prediction Standard Error".  I thought this was also displaying the square root of the kriging variance.

- The values on the "Prediction Standard Error Map" are significantly lower at my test point locations than the "Standard Error" values from the GA Layer to Points step (loosely 1/5 of the value).

- I've confirmed this by converting the Prediction Standard Error Map to a Raster, and extracting values to the same test data set I used GA Layer to Points on, and comparing both values at all the points.

=====

Are the "Standard Error" values from "GA Layer to Points" not the same as the "Prediction Standard Error" from the "Prediction Standard Error Maps"?

Am I using one of the tools wrong?

Any clarification is greatly appreciated. Let me know if a better description or further elaboration from my end would help. I'm using ArcGIS 10.1 SP1.

Dave Wetta

5 Replies
EricKrause
Esri Regular Contributor

** EDIT: This post contains some incorrect information.  See later posts for correction **

The values of the Prediction Standard Error map and the Standard Error field from GA Layer to Points should match, as they are indeed calculating the same thing.  I did a quick test, and they are matching for me. 

I suspect that the problem is in how you are querying the value of the standard error from the geostatistical layer.  Converting to raster and extracting the value will introduce some error (because it is using the cell center, not the exact location of the point).  The Identify tool can also be misleading if you are zoomed out too far (because you have to click exactly on the center of the point).

I was able to confirm that the values match by zooming in as far as ArcMap would let me, then I used Identify to query the value of the geostatistical layer and of the output from GA Layer to Points.  Try this with your data and let me know what happens.

DaveWetta
New Contributor

Thanks so much for your prompt reply.

No, I do not think this is an issue of rounding, interpolation or query method.  The values aren't even close.

No test point has a standard error below 5 from GA Layer to Points. The Prediction Standard Error Map has values below 2 for most of my test points.

As I stated originally, the Prediction Standard Error Map values are about 1/5 of the standard error values from GA Layer to Points. (Although now that I look closer, the two error values are much closer together at the high values than at the low values.)

Also, most of my data on the Prediction Standard Error Map lie in a large continuous blob of values well under 2, so I don't think it has anything to do with the raster geometry (the raster also has a relatively fine resolution).

It is encouraging to hear this isn't a general problem, and my understanding of the tools and concepts is largely correct...

But I still need to figure out what is happening with my data.  I'm glad to share the data and layer if there's a convenient way to do so, or provide any other details or screen captures that help.

Thanks again,

Dave Wetta

EricKrause
Esri Regular Contributor

Can you please make a map package (mpk) and send it to ekrause@esri.com?  Make sure to have your geostatistical layer, the training points, and the test points in the package.  Thanks.

EricKrause
Esri Regular Contributor

Here are the steps to make a map package, in case you have never done it:

http://resources.arcgis.com/en/help/main/10.1/index.html#//006600000403000000

Make sure to remove any basemaps or unrelated data before creating the mpk.  Thanks.

EricKrause
Esri Regular Contributor

I was recently reminded that I never followed up on this thread.  We were able to figure out what happened, and I actually learned something I didn't know about our tools.

GA Layer to Points operates slightly differently depending on whether or not you provide a validation field.  If you do not provide one, it simply extracts the value of the geostatistical layer.

If you provide a validation field, things get a little complicated.  In geostatistics, there is a distinction between the true process and the measured process.  We assume that each location does have a true value, but when you attempt to measure it, you introduce measurement error.  We say that the measured value is the sum of the true value and random noise.  When you provide a validation field, that field is assumed to contain measured values, not true values.  To validate against them correctly, you have to use the standard error of the measured process, which is always larger than the standard error of the true process.
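To make the true-versus-measured distinction concrete, here is a minimal sketch in Python with NumPy. It is not Geostatistical Analyst's implementation. It assumes simple kriging with a known mean, an exponential covariance for the true process, and independent measurement noise with a constant variance; the function names, parameter names, and all numbers are hypothetical. Under those assumptions, the prediction variance for a new measured value at an unsampled location equals the prediction variance for the true value plus the measurement error variance.

```python
import numpy as np


def exp_cov(h, partial_sill, range_):
    """Exponential covariance of the true (noise-free) process; illustrative choice."""
    return partial_sill * np.exp(-h / range_)


def simple_kriging_std_errors(train_xy, target_xy, partial_sill, range_, meas_var):
    """Return (se_true, se_measured) at one unsampled target location.

    se_true:     standard error when predicting the true value.
    se_measured: standard error when predicting a new noisy measurement.
    meas_var is the assumed measurement error variance (hypothetical input).
    """
    # Pairwise distances among training points, and from each one to the target.
    d_tt = np.linalg.norm(train_xy[:, None, :] - train_xy[None, :, :], axis=2)
    d_t0 = np.linalg.norm(train_xy - target_xy, axis=1)

    # Covariance of the noisy observations: true process plus independent noise.
    K = exp_cov(d_tt, partial_sill, range_) + meas_var * np.eye(len(train_xy))
    # Covariance between each noisy observation and the true value at the target.
    k0 = exp_cov(d_t0, partial_sill, range_)

    weights = np.linalg.solve(K, k0)
    var_true = partial_sill - weights @ k0
    # A new measurement carries its own independent noise on top of that.
    var_measured = var_true + meas_var
    return np.sqrt(var_true), np.sqrt(var_measured)


if __name__ == "__main__":
    train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    target = np.array([0.5, 0.5])
    se_true, se_measured = simple_kriging_std_errors(
        train, target, partial_sill=1.0, range_=2.0, meas_var=0.5
    )
    print("true process SE:", se_true)
    print("measured process SE:", se_measured)
```

Because the extra term is added to the variance, it inflates small standard errors proportionally more than large ones. That is at least consistent with the earlier observation in this thread that the two sets of values were closer together at the high end than at the low end.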

Things get extra complicated if you apply a transformation because then the predicted value actually depends on the standard error.  In that case, both the predictions and the standard errors will be different depending on whether you provide a validation field.
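For the simpler case with no transformation, one rough way to check whether a measurement error term explains a gap between the two outputs is to compare squared standard errors point by point. The sketch below is plain Python and uses made-up numbers, not data from this thread; the function names are hypothetical. It assumes the measured-process variance is the true-process variance plus one constant, in which case the squared differences should be about the same at every point.

```python
from statistics import mean, pstdev


def variance_gaps(se_points, se_map):
    """Per-point difference of squared standard errors (points minus map)."""
    if len(se_points) != len(se_map):
        raise ValueError("both inputs must have one value per test point")
    return [p * p - m * m for p, m in zip(se_points, se_map)]


def summarize_gaps(se_points, se_map):
    """Return (mean gap, spread of gaps); a small spread suggests a constant offset."""
    gaps = variance_gaps(se_points, se_map)
    return mean(gaps), pstdev(gaps)


if __name__ == "__main__":
    # Hypothetical values: map SEs, and point SEs built with a constant added variance of 24.
    se_map = [1.0, 2.0, 5.0]
    se_points = [(m * m + 24.0) ** 0.5 for m in se_map]
    gap_mean, gap_spread = summarize_gaps(se_points, se_map)
    print("mean variance gap:", gap_mean)
    print("spread of variance gap:", gap_spread)
```

If a transformation was applied, this check is not expected to hold, since the predictions themselves then depend on the standard error.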

The issue is discussed thoroughly in this paper:

Krivoruchko, K., A. Gribov, and J. M. Ver Hoef, 2006, "A new method for handling the nugget effect in kriging," in T. C. Coburn, J. M. Yarus, and R. L. Chambers, Eds., Stochastic modeling and geostatistics: Principles, methods, and case studies, volume II: AAPG Computer Applications in Geology 5, p. 81–89.
