Skill assessment of kriging

08-07-2012 07:14 PM
DeanMorgan
New Contributor
Hi,

I have a kriged GA layer of precipitation data from approximately 50 points (i.e. stations) and would like to assess the skill of the interpolated surface.

I would like to omit some of the stations prior to kriging and calculate the mean absolute error between the interpolated and actual data, then repeat this many times to give a distribution of errors.

Could anyone suggest an efficient method of carrying out the above task using the Geostatistical Wizard? For example, how can I remove some point data from the kriging without changing the source of the data?

Also, is the 'average standard error' equivalent to the 'mean absolute error'?

Thank you for your help, it is greatly appreciated.
EricKrause
Esri Regular Contributor
What you're referring to is called validation. 

Here's how you do it:
1.  Use Subset Features to randomly partition your data into a training set and a testing set (in the tool, you specify how many points will go in each subset).
2.  Perform kriging in the Geostatistical Wizard on the training features.
3.  Use the kriging surface as input to GA Layer to Points. Predict to the testing features, and specify the field to validate on (the field you used to interpolate).  This will create a new point feature class with validation statistics.
4.  To calculate the mean absolute error, you'll use the "Error" field in the new feature class.  Make a new field and calculate the absolute value of the error.  Then take the average of these absolute values.  To calculate the average standard error, take the average of the "Standard Error" field.
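As a rough illustration of step 4, here is a minimal Python sketch of the two calculations outside ArcGIS. The rows are made-up values standing in for the "Error" and "Standard Error" fields of the validation output; they are assumptions for illustration, not data from this thread. Note that the two statistics come from different fields, so they are not the same quantity.

```python
from statistics import mean

# Hypothetical validation rows: (Error, Standard Error).
# These numbers are invented for illustration only.
rows = [
    (-2.0, 3.0),
    (5.0, 4.0),
    (-7.0, 5.0),
    (1.0, 2.0),
    (-3.0, 6.0),
]

errors = [error for error, _ in rows]
standard_errors = [std_err for _, std_err in rows]

# Mean absolute error: average of the absolute values of the "Error" field.
mean_absolute_error = mean(abs(e) for e in errors)

# Average standard error: plain average of the "Standard Error" field.
average_standard_error = mean(standard_errors)

print("MAE:", mean_absolute_error)
print("Average standard error:", average_standard_error)
```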
DeanMorgan
New Contributor
Hi Eric,

A saviour once again. Should I be getting something like this? [ATTACH=CONFIG]16778[/ATTACH]

Take the average of those values in the 'error' column (the 5 originally omitted), repeat a 100 times say, to give a distribution of MAE?

Thanks again!
EricKrause
Esri Regular Contributor
That table has the correct fields, but all the error values of zero are a bit unusual.  Did you predict back to the same data that you used to build the model?  Validation really only makes sense when you validate on points that were not used in the interpolation.  That is why Subset Features creates two different outputs: one for interpolating (the "training" features) and one for validating (the "test" features). 

"Take the average of those values in the 'error' column (the 5 originally omitted), repeat a 100 times say, to give a distribution of MAE? "

Remember to take the absolute value of the errors before taking the average.  You want the mean absolute error, not the mean error.
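To see why the distinction matters, here is a small Python sketch with made-up error values (an assumption for illustration). Positive and negative errors cancel in the mean error, so it can be close to zero even when the individual predictions are poor; the mean absolute error does not have that problem.

```python
from statistics import mean

# Hypothetical validation errors; invented for illustration.
errors = [-4.0, 4.0, -1.0, 1.0]

# Mean error: positive and negative errors cancel each other out.
mean_error = mean(errors)

# Mean absolute error: every error counts by its magnitude.
mean_absolute_error = mean(abs(e) for e in errors)

print("Mean error:", mean_error)
print("Mean absolute error:", mean_absolute_error)
```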
DeanMorgan
New Contributor
Thanks Eric, more like this right? [ATTACH=CONFIG]16784[/ATTACH]

I imagine it will take me an awfully long time to repeat this 100 times. Is there a way to speed the process up?

Also, by absolute, just ignore the negative integers right.

Thanks!
EricKrause
Esri Regular Contributor
That table looks correct now.

As for iterating the workflow many times: you cannot automate the Geostatistical Wizard, so there's no way to fully automate this workflow.  You could, however, write a Python script tool or a model in ModelBuilder to calculate the MAE once you've created the kriging layer.
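For what the repetition logic might look like, here is a minimal plain-Python sketch that runs outside ArcGIS. It is only an assumption-laden illustration: the stations are randomly generated, and the predictor is inverse distance weighting, used purely as a stand-in because it fits in a few lines. It is not kriging and does not call the Geostatistical Wizard. The names `idw_predict` and `holdout_mae` are hypothetical helpers, not ArcGIS functions.

```python
import math
import random
from statistics import mean


def idw_predict(train, x, y, power=2.0):
    """Stand-in interpolator (inverse distance weighting), NOT kriging."""
    weighted_sum = 0.0
    weight_total = 0.0
    for tx, ty, value in train:
        distance = math.hypot(x - tx, y - ty)
        if distance == 0.0:
            return value  # exactly on a training station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def holdout_mae(stations, n_test, rng):
    """Randomly hold out n_test stations and return the MAE on them."""
    shuffled = stations[:]
    rng.shuffle(shuffled)
    test, train = shuffled[:n_test], shuffled[n_test:]
    return mean(abs(idw_predict(train, x, y) - value) for x, y, value in test)


rng = random.Random(42)  # fixed seed so the runs are repeatable

# 50 made-up stations as (x, y, precipitation); invented for illustration.
stations = [
    (rng.uniform(0, 100), rng.uniform(0, 100), rng.uniform(0, 50))
    for _ in range(50)
]

# Repeat the random split 100 times, omitting 5 stations each time.
mae_values = [holdout_mae(stations, n_test=5, rng=rng) for _ in range(100)]

print("min MAE:", min(mae_values))
print("mean MAE:", mean(mae_values))
print("max MAE:", max(mae_values))
```

The list `mae_values` is the distribution of MAE across the repeated splits; it could be summarised or plotted as a histogram.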
EricKrause
Esri Regular Contributor
"Also, by absolute, just ignore the negative integers right."

Sorry, I missed this earlier.  In the Field Calculator, there's a function called Abs().  That takes the absolute value of a field.  Negative values become positive, and positive values don't change.  For example, if you had three values, (-2 , 5, -7), taking the absolute value would result in (2 , 5 , 7).
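The same example can be checked in plain Python, where the built-in `abs()` behaves the same way on each value. This is a sketch outside the Field Calculator, using the three values from the example above.

```python
# The three example values from the post.
values = (-2, 5, -7)

# Negative values become positive; positive values are unchanged.
absolute_values = tuple(abs(v) for v in values)

print(absolute_values)  # → (2, 5, 7)
```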
DeanMorgan
New Contributor
Thank you again, Eric. That worked great for the MAE distribution.