Is there a rule of thumb for the search radius when using empirical bayesian kriging?

06-03-2015 09:08 AM
JohannesKrimm
New Contributor II

I am working on a dataset using Empirical Bayesian Kriging. EDA suggests using a Log Empirical transformation and a K-Bessel detrended semivariogram type. Without further adjustment, I achieved the "best" results (concerning error statistics) when using a smooth circular neighborhood type.

I am now trying to give my results more significance, especially concerning the spatial range of the prediction, by using an appropriate search radius.

The only quote concerning this issue I could find so far is in Sumner, M. E. (1999): Handbook of Soil Science, and says: "the search radius should not exceed the range of the semivariogram, and is typically less than ½ the range".

Is this a good starting point or are there other suggestions?


7 Replies
DanPatterson_Retired
MVP Emeritus

If you are working with similar soils data, it would be appropriate.  Are there any other suggestions in the help files?

JohannesKrimm
New Contributor II

Unfortunately, I could not find any useful tips in the help files.

I am not working specifically on soil data, but on clastic sediments and associated parameters like transmissivity. So the quote from the Handbook of Soil Science was the closest I could get.

Nevertheless, any suggestions would be appreciated.

EricKrause
Esri Regular Contributor

There aren't many criteria for what makes a good search radius, but there is plenty about what makes a bad one.  It definitely should not be larger than the range of the semivariogram (as you noted), and it should be large enough to capture at least 10 points everywhere in the data domain.  Beyond that, there isn't much to recommend other than comparing validation and cross-validation statistics for different search radii.

In the Geostatistical Wizard, the default search radius for smooth interpolation is calculated such that it attempts to use at least 32 neighbors at each location.  It isn't a perfect algorithm, but we have found it to be reliable and robust.
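A quick way to sanity-check a candidate radius against the "at least 10 points everywhere" rule is to count, for every prediction location, how many samples fall within that radius and look at the minimum. This is a minimal NumPy sketch, not anything from the Geostatistical Wizard; the sample layout, prediction grid, and `semivariogram_range` value are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.uniform(0, 100, size=(200, 2))       # hypothetical sample locations
gx, gy = np.meshgrid(np.linspace(0, 100, 25), np.linspace(0, 100, 25))
grid = np.column_stack([gx.ravel(), gy.ravel()])   # prediction locations

def min_neighbors(radius, grid, samples):
    """Smallest number of samples within `radius` of any prediction location."""
    d = np.linalg.norm(grid[:, None, :] - samples[None, :, :], axis=2)
    return int((d <= radius).sum(axis=1).min())

semivariogram_range = 40.0                         # assumed, from your fitted model
for r in (5.0, 10.0, 20.0, semivariogram_range / 2):
    print(f"radius {r:5.1f}: worst-case neighbor count = {min_neighbors(r, grid, samples)}")
```

A candidate radius passes the rule of thumb when the worst-case count is at least 10 while the radius itself stays below the semivariogram range.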

JohannesKrimm
New Contributor II

Once again, thank you very much for the insight!

EricKrause
Esri Regular Contributor

I should also say that it depends on how fast you need the method to process.  In general, it is better to use a search radius that is too big than one that is too small.  If it is too small, you are missing relevant information in the neighboring points, and the quality of your predictions will decline.  If it is too large, you are pulling in information that is not useful, but these non-informative neighbors tend to get very small weights, so they have little impact on the quality of the interpolation.  However, the more neighbors you use, the longer the method takes to calculate.

If you aren't concerned with processing speed, you should err on the side of a larger radius rather than a small one.  That being said, the default of ~32 neighbors is almost always more than enough, and you generally won't see improvement in predictions by adding more neighbors.  But, as always, it depends on your data.

JohannesKrimm
New Contributor II

I appreciate the additional information!

I have now taken my knowledge of the sedimentary conditions and their spatial distribution into account and ended up with a search radius that is about half the average range and incorporates between 25 and 50 sample points, depending on the location of prediction. So I guess I was on the right track.

One last question if you don't mind: Is it safe to assume similar numbers (~32 neighbors) to be sufficient and robust for a standard circular search neighborhood as well, or are there any major differences?

EricKrause
Esri Regular Contributor

Generally, yes, though a standard search neighborhood has more parameters.  First, it can use sectors, which force particular numbers of neighbors to come from different directions.  Second, it takes minimum and maximum numbers of neighbors (per sector) as parameters.

The way the algorithm works is:

  1. For each sector, take at least the minimum number of neighbors (even if that requires going beyond the search radius).
  2. Keep taking additional neighbors until you hit the maximum number of neighbors or reach the edge of the search radius.

So, for a standard search neighborhood, you can get similar behavior by setting the minimum number of neighbors equal to the maximum number of neighbors (no matter what your search radius is).  Remember to take the number of sectors into account.  For example, if you have four sectors, and you set min=max=8, then each location will use exactly 32 neighbors in the calculation (8 from each of the four sectors).  That is, assuming there actually are at least 8 neighbors in each of the sectors.
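Esri doesn't publish the exact internals, but the two-step sector rule described above can be sketched in plain NumPy. Everything here (the `sector_neighbors` name, the parameter names, the demo data) is illustrative only:

```python
import numpy as np

def sector_neighbors(center, points, radius, n_sectors=4, n_min=8, n_max=8):
    """Pick neighbors sector by sector: take at least n_min per sector (even if
    that means going past the radius), then keep adding by distance until
    n_max is reached or no more points lie within the radius."""
    d = points - center
    dist = np.hypot(d[:, 0], d[:, 1])
    angle = np.arctan2(d[:, 1], d[:, 0]) % (2 * np.pi)
    sector = (angle // (2 * np.pi / n_sectors)).astype(int)
    chosen = []
    for s in range(n_sectors):
        idx = np.where(sector == s)[0]
        idx = idx[np.argsort(dist[idx])]              # nearest first in this sector
        within = int((dist[idx] <= radius).sum())     # how many lie inside the radius
        take = min(max(within, n_min), n_max)         # clamp between n_min and n_max
        chosen.extend(idx[:take])
    return np.array(chosen)

# Demo: min = max = 8 over 4 sectors, so each location uses 8 neighbors per
# sector regardless of the (here deliberately tiny) radius.
rng = np.random.default_rng(1)
pts = rng.uniform(-1, 1, size=(200, 2))
idx = sector_neighbors(np.zeros(2), pts, radius=0.1)
print(len(idx))
```

With `n_min = n_max`, the radius stops mattering for the neighbor count (as long as each sector has enough points), which is the equivalence to the smooth neighborhood's fixed neighbor count described above.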