Latest Contributions by EricKrause

‎08-23-2019

Hi, I just noticed that you were asking about creating a Voronoi Map from polygons rather than points. If you just need the geometry of the Voronoi Map and do not need the local statistics, you can do the following:
1. Convert the polygons to point centroids using the "Feature To Point" geoprocessing tool.
2. Use the centroids as input to the "Create Thiessen Polygons" tool.
3. Optionally, clip the Thiessen polygons to a boundary using the "Clip" tool.
This will give the same polygons as the Voronoi Map ESDA tool in ArcMap. Again, it will not have the local statistics, but it will create the same polygons.
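The three tools named above can be chained in a short script. This is only a sketch, assuming an ArcGIS Python environment where arcpy is available; the function name, the output names ("centroids", "thiessen", "thiessen_clipped"), and the choice of "CENTROID" and "ALL" for the optional parameters are illustrative assumptions, not something from the original post.

```python
import os


def voronoi_from_polygons(polygons, workspace, clip_boundary=None):
    """Build Thiessen (Voronoi) polygons from polygon centroids.

    polygons, workspace, and clip_boundary are placeholder paths; the output
    feature class names below are hypothetical.
    """
    # Imported here so the module loads even where ArcGIS is not installed.
    import arcpy

    centroids = os.path.join(workspace, "centroids")
    thiessen = os.path.join(workspace, "thiessen")

    # Step 1: polygons -> centroid points.
    arcpy.management.FeatureToPoint(polygons, centroids, "CENTROID")

    # Step 2: centroid points -> Thiessen polygons, keeping all attributes.
    arcpy.analysis.CreateThiessenPolygons(centroids, thiessen, "ALL")

    # Step 3 (optional): clip the result to a boundary.
    if clip_boundary is None:
        return thiessen
    clipped = os.path.join(workspace, "thiessen_clipped")
    arcpy.analysis.Clip(thiessen, clip_boundary, clipped)
    return clipped
```

The function returns the path of the last feature class it created, so it can be passed straight to whatever comes next in a model or script.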

‎08-23-2019

Hi Liliana, Unfortunately, the Voronoi Map tool is not available in ArcGIS Pro. Most Exploratory Spatial Data Analysis (ESDA) tools from ArcMap are available as charts in ArcGIS Pro, but the Voronoi Map is not one of them. -Eric Krause

‎07-22-2019

We are happy to announce that Empirical Bayesian Kriging (EBK) has been fully peer-reviewed, and the methodology has been accepted into the August edition of the Spatial Statistics journal. https://www.sciencedirect.com/science/article/pii/S2211675319300168 This is the reference you should cite if you need to academically cite Empirical Bayesian Kriging, EBK Regression Prediction, or Empirical Bayesian Kriging 3D in your work or studies. The article contains an overview of the methodology behind EBK, as well as many results from controlled simulations. The results are very impressive, and they show that EBK performs as well as (and often better than) other modern interpolation routines in a variety of situations. It also contains recommendations for when the default parameters are sufficient and when more advanced parameters should be considered.

‎07-12-2019

Hi Vito, It is possible to automate the Optimize button using Create Geostatistical Layer, but it involves making manual edits to XML files, so be careful before doing this. We generally do not recommend automating this type of kriging, which is why there is no easy way to do it. Instead, we recommend using Empirical Bayesian Kriging, which is available as a geoprocessing tool for easy automation and is much safer to run in an automated fashion (though no interpolation method is completely safe when run without active monitoring). If you really want to automate the Optimize button in kriging, this topic will explain how to create the XML file and what alterations you need to make: https://community.esri.com/thread/80709 That topic is very long with lots of discussion that won't be relevant to you. Here are the relevant posts:
The Create Geostatistical Layer tool takes a model source as input. This model source can be a geostatistical layer (either a layer in ArcMap or saved as a .lyr file on disk) or an XML model source. For your purposes, the XML will be easier to use. The tool reads all the interpolation parameters from the model source (type of kriging, nugget, range, sill, transformations, etc.) and applies them to a new dataset. This is useful for something like temperature data taken daily. You can build the model for one day and easily apply that model to each subsequent day. However, if you're using different kinds of data (temperature, elevation, pollution, etc.), you don't want to keep using the same model over and over because the interpolation parameters will not fit different types of data. So, you need to tell the tool to recalculate all these parameters (explained below) for each new dataset.
You only need to manually create one XML file source. You can do this with the Geostatistical Wizard. Open the Wizard, choose Ordinary kriging, and give it a dataset.
When you click Finish, the Method Report screen will pop up and show all the parameters that you used. Click "Save..." and save the XML file in a convenient location. You then need to open the XML file with a text editor (Cooktop is a useful and free XML editor, but you can do it with Notepad too). Inside the XML, you'll see all of the parameters, and most of them will have auto = "false" after them. This auto flag tells the tool whether to recalculate that parameter or keep it fixed when it is used as a model source. For every parameter that you want to be recalculated, you need to change the flag to auto = "true" and save the XML. You only need to do this once, and you'll keep reusing this same model source. You can then set up a loop in Python to iterate through your datasets. For each dataset, you'll use the XML as the model source in Create Geostatistical Layer. This will generate a geostatistical layer for each dataset, where the interpolation parameters have been recalculated, and you can use these layers to create the ASCII files that you described in your first post. The difficulty in automating Ordinary/Simple kriging is one of the many reasons we made Empirical Bayesian Kriging as a geoprocessing tool.
If you are using ArcGIS Pro, you'll find the option to save the XML in a dropdown menu on the Method Report screen of the Geostatistical Wizard. In ArcMap, it is a button. The specific changes you need to make to the XML file to automate the Optimize button are explained in this post:
You can recreate the defaults from the Geostatistical Wizard by editing one line of your XML file. You should see a line at the top of the XML that says:
<model name="Kriging">
Change this line to:
<model name="Kriging" optimize = "BySill">
You actually don't need to change any "auto" flags. This "optimize" flag will override any auto flags, so it doesn't matter if they are false or true.
If you want to automate the Optimize Model button in the Geostatistical Wizard, you can change that same line to: <model name="Kriging" optimize = "ByCrossvalidation"> You will want to use the "ByCrossvalidation" option, and you don't need to worry about "auto" flags. Make a copy of the XML before you edit it because it is very easy to accidentally corrupt it.
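The one-line XML edit described above can also be scripted, which avoids hand-editing mistakes. The sketch below uses only Python's standard library and assumes the `<model>` element is the root of the file (or at least appears somewhere in it) and carries no XML namespace; the function name and file paths are hypothetical. It writes to a new file so the original XML is left untouched as a backup. Note that ElementTree may drop comments and change whitespace when it rewrites the file.

```python
import xml.etree.ElementTree as ET

VALID_MODES = ("BySill", "ByCrossvalidation")


def set_optimize_flag(xml_in, xml_out, mode="ByCrossvalidation"):
    """Copy a saved model-source XML to xml_out with optimize="<mode>" set."""
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of %s" % (VALID_MODES,))

    tree = ET.parse(xml_in)
    root = tree.getroot()

    # Assumption: <model name="Kriging"> is the top-level element; fall back
    # to a search in case it is nested.
    model = root if root.tag == "model" else root.find(".//model")
    if model is None:
        raise ValueError("no <model> element found in %s" % xml_in)

    model.set("optimize", mode)

    # Write to a separate file so the original stays intact.
    tree.write(xml_out, encoding="utf-8", xml_declaration=True)
    return xml_out
```

The edited file can then be used as the model source for Create Geostatistical Layer in the dataset loop.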

‎06-20-2019

I agree with Dan. The Predicted vs Measured line is fairly flat, and the scatterplot is pretty noisy. Also, the values in the Error column are fairly large relative to the measured values (often above 50% of the measured value). If I were you, I would be much more concerned about accuracy than bias. I don't see any reason to doubt that the model is unbiased, but these numbers and graphs are pretty typical when the data does not have very much autocorrelation or is very noisy.

‎06-20-2019

The mean error is the average of all the cross validation errors. A positive error means that the predicted value is larger than the true value, and a negative error means that the predicted value is less than the true value. For unbiased models, the underpredictions should cancel out the overpredictions on average, and the mean error should be close to zero. Sometimes it will be a little bit negative, sometimes a little bit positive, but if it is close to zero, you have evidence that the model is unbiased. Your value literally means that, on average, the cross validation predictions were 0.0081 lower than the true values. Since I can see in your screenshot that your measured values range from about 0.03 to 0.24, a value of -0.0081 is negligible. If the mean error were instead something like 0.1, that would be very concerning, as the level of bias would be as large as most of the measured values.
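As a small illustration of the definition above, the mean error is just the average of (predicted minus measured). The numbers below are made up for the example and are not the values from the screenshot.

```python
def mean_error(predicted, measured):
    """Average of (predicted - measured); negative means underprediction."""
    if not measured or len(predicted) != len(measured):
        raise ValueError("inputs must be non-empty and the same length")
    errors = [p - m for p, m in zip(predicted, measured)]
    return sum(errors) / len(errors)


# Hypothetical cross validation results.
measured = [0.03, 0.10, 0.18, 0.24]
predicted = [0.04, 0.08, 0.17, 0.22]

me = mean_error(predicted, measured)
data_range = max(measured) - min(measured)

# Comparing the bias to the spread of the data helps judge whether it matters.
print("mean error:", round(me, 4))
print("bias relative to data range:", round(abs(me) / data_range, 3))
```

A mean error that is a small fraction of the data range, as here, suggests the bias is negligible.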

‎04-25-2019

That particular formula is the definition of the kriging model. The variable s always represents an arbitrary spatial location. Z(s) is the measured value at location s. Since the goal is to interpolate, you want to predict Z(s) at every s where you didn't take a measurement. The equation is saying that the measured value Z(s) is equal to the sum of a mean function µ(s) and a spatially autocorrelated error function ε(s). For your paper, I would write something along those lines. The differences between the various kriging models usually depend on how these mean and error terms are defined and estimated. There is also a common convention in statistics that is very important but also very easy to miss if you don't know to look for it. Look at the following two nearly identical models:
Z(s) = µ(s) + ε(s) [Universal kriging]
Z(s) = µ + ε(s) [Ordinary kriging]
The only difference in the model definition between universal and ordinary kriging is µ versus µ(s). Since s refers to a location and µ refers to the mean, the notation µ(s) indicates that the mean depends on the location. Similarly, the notation µ indicates that the mean does not depend on the location; in other words, the mean is constant at every location. This is very significant because a mean value that changes from location to location is usually called a trend. The real difference between those two models is that the first supports a trend and the second doesn't, and all of this gets indicated by a tiny difference in notation. But even once the model definition is understood, there are still formulas and equations for estimating the mean value or trend, more formulas for estimating the autocorrelated error term (this is where the semivariogram comes into play), more formulas for neighborhoods, predictions, transformations, etc. It's a lot to unpack, but that's why the Geostatistical Analyst help is very long and spread across many topics. -Eric
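For a paper, the two model definitions discussed above could be typeset along these lines (this needs the amsmath package). The last line is only an illustrative assumption of what a trend might look like, here a first-order polynomial in the coordinates s = (x, y); it is not the only form a trend can take.

```latex
\begin{align*}
Z(s) &= \mu(s) + \varepsilon(s) && \text{(universal kriging: the mean depends on location)} \\
Z(s) &= \mu + \varepsilon(s)    && \text{(ordinary kriging: the mean is constant)} \\
\mu(s) &= \beta_0 + \beta_1 x + \beta_2 y && \text{(example trend, assuming } s = (x, y)\text{)}
\end{align*}
```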

‎04-25-2019

Hi Elisios, I think part of the confusion is that these two topics are written for very different purposes. The first link is to the documentation for the Kriging tool in the Spatial Analyst toolbox. It is a quick reference for how kriging works from beginning to end. It is all you should need to read in order to understand how to use the Kriging tool. The second link is to a topic in the Geostatistical Analyst extension. This extension was designed specifically for complex kriging workflows, so the documentation goes into a lot more depth and is spread through many topics. This particular topic is very narrow in scope, and it is specifically about how the different kriging models (ordinary, simple, universal, cokriging, etc) differ from each other in their most basic definitions. These definitions involve autocorrelation and definitions of the mean function, and knowing how these models are defined will hopefully help you decide which one is most applicable for your data. But this topic alone is not nearly enough to understand how kriging works. Here is a link to an old Geostatistical Analyst manual that has many formulas in the appendix that are not in the online help documentation: http://dusk2.geo.orst.edu/gis/geostat_analyst.pdf If there are any particular formulas that you have questions about, let me know. -Eric

‎04-17-2019

(1) In Moving Window Kriging, the number of neighbors is the number of points that will be used to estimate the semivariogram parameters at each location. To estimate the semivariogram parameters at a location, the software must use data from around the location (these points are called neighbors). You want a sufficient number of neighbors to be able to estimate the semivariogram parameters accurately, but you also want the estimate to be local, so you don't want to use more neighbors than necessary. Using 30 points is usually sufficient.
(2) The partial sill plus the nugget will equal the variance of the data. I'm not entirely clear what you mean by "spatial variance." The semivariogram itself is what defines spatial covariance (i.e., how correlated points are, given how far apart they are).

‎04-17-2019

Hi Liu, The Moving Window Kriging geoprocessing tool is designed to calculate local semivariograms at particular locations. The workflow you need to follow is:
1. If your grid is a raster, you will need to convert it to a point feature class using Raster To Point.
2. Use the Geostatistical Wizard to interpolate these points using kriging. Choose the kriging type (simple, ordinary, etc.), define any options like detrending or transformation, and choose a semivariogram model (Exponential, Stable, K-Bessel, etc.). Do not worry about parameters like the range or sill yet; this interpolation is only to make a template for use in the next step.
3. Use Moving Window Kriging and provide the geostatistical layer created in step 2 as the model source. You'll need to provide the locations where you want to calculate the semivariogram (use the same points you used to interpolate) and choose how many neighbors you want to use for each local calculation.
The output will be a point feature class containing, among other things, the nugget, range, and partial sill calculated at each location. These are based on the kriging type and semivariogram model chosen in the Geostatistical Wizard and on the number of neighbors defined in Moving Window Kriging. Let me know if you have any other questions or need any clarifications. -Eric

‎03-14-2019

If the problem is picking between various models that all look good, that's a pretty good problem to have. As you said, probably just go with the one with the smallest RMS, but you should definitely note that different models with slightly varying parameters gave nearly identical results. That is a very good thing, as it means that your predicted values are robust. There is actually a tool called Semivariogram Sensitivity that follows a very similar workflow, and it may save you some time. You give a kriging model as input, then give some tolerances for the semivariogram parameters, and it tries random combinations and produces predicted values and standard errors at a new set of locations. The idea is that if the predicted values and standard errors don't change much for different parameter combinations, the somewhat arbitrary choice of parameters didn't have a huge impact. [I've edited this post. It previously had some incorrect information.]

‎03-13-2019

Hi Oliver. I know you're going to hate this answer, but the importance of different cross validation statistics depends a lot on your particular workflow and requirements. For example, if you only need predicted values and don't need standard errors (measures of uncertainty of the predicted values), then the Average Standard Error and RMS Standardized both become unimportant. The RMS (not the standardized RMS) directly measures how close the predicted values are to the measured values. The Mean and Mean Standardized both measure model bias; i.e., whether there is a tendency to under- or over-predict the values (an unbiased model will have Mean and Mean Standardized values close to 0). Combining these two, you have information both about model accuracy and model bias, and it is up to you to decide which of these properties is most important.
That being said, here is the workflow that I usually follow. I consider the RMS Standardized to be a sanity check. If it is less than 0.8 or more than 1.2, I usually reject the model before even looking at other statistics. I'll then look at the Root Mean Square and decide if it is acceptable. Because it is in the units of the data, it gives an average margin of error for prediction (for example, if the RMS is 3, then on average each predicted value will be off by 3 from the true value at the location). Whether this margin of error is acceptable is going to depend on your workflow. If the margin of error is acceptable, I then move to the Mean and decide if the level of bias is acceptable. Again, the Mean is in data units, so it directly measures, on average, how much the values are under- or over-predicted. Is it acceptable if the model on average estimates values that are 0.5 higher than the measured values? Like before, it heavily depends on your workflow and the data.
Additionally, do not discount common sense and expert knowledge.
If the cross validation results look good, but you know the surface is incorrect, you are completely justified in rejecting the model. All the software knows is the locations of the points and a number attached to them, and it is doing its best to detect patterns and correlations. But if you know that the patterns and correlations don't actually hold up out of sample, don't feel obligated to use them. I also found this paper that discusses different approaches to groundwater interpolation. They ultimately recommend "non-colocated cokriging." They essentially used historical data sampled at separate locations as a cokriging variable. This ended up outperforming every univariate interpolation method.
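To make the statistics in the workflow above concrete, here is a rough sketch of how they can be computed from cross validation output. The formulas follow the usual definitions as I understand them (error = predicted minus measured, standardized error = error divided by the prediction standard error); the function names and the sample numbers are made up for illustration.

```python
import math


def cross_validation_stats(predicted, measured, std_errors):
    """Summary statistics from cross validation results."""
    n = len(measured)
    if n == 0 or len(predicted) != n or len(std_errors) != n:
        raise ValueError("inputs must be non-empty and the same length")

    errors = [p - m for p, m in zip(predicted, measured)]
    standardized = [e / s for e, s in zip(errors, std_errors)]

    return {
        "mean": sum(errors) / n,                                # bias, data units
        "rms": math.sqrt(sum(e * e for e in errors) / n),       # accuracy, data units
        "mean_standardized": sum(standardized) / n,             # bias, unitless
        "rms_standardized": math.sqrt(sum(z * z for z in standardized) / n),
        "average_standard_error": math.sqrt(sum(s * s for s in std_errors) / n),
    }


def passes_sanity_check(stats, low=0.8, high=1.2):
    """First gate: RMS Standardized should be between 0.8 and 1.2."""
    return low <= stats["rms_standardized"] <= high


# Hypothetical values, for illustration only.
stats = cross_validation_stats(
    predicted=[10.0, 12.0, 9.0, 11.0],
    measured=[9.0, 13.0, 10.0, 10.0],
    std_errors=[1.0, 1.0, 1.0, 1.0],
)
if passes_sanity_check(stats):
    print("RMS:", stats["rms"], "Mean:", stats["mean"])
else:
    print("Rejected: RMS Standardized =", stats["rms_standardized"])
```

Whether the resulting RMS and Mean are acceptable is still a judgment call that depends on the data and the application; the code only handles the arithmetic and the first gate.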

‎02-15-2019

Thanks Dan. There is a little draping bug you can see around 6:23 in the video, but it was fixed before Pro 2.3 was released.

‎02-14-2019

The "Model #" controls don't refer to the primary/secondary dataset. They're actually used to mix several semivariogram models together into a single new semivariogram. For example, you can use the Stable model (which is the default), or you can change it to, for example, Spherical. But you can also create a new semivariogram that is an average between Stable and Spherical (weighted averages of valid semivariograms are themselves valid semivariograms) by providing one semivariogram into Model 1 and the other into Model 2 (and even a third into Model 3). This feature isn't often used, but my understanding is that it is useful in situations where the data are affected by two different processes, one short-range and the other long-range. In that case, you can mix one semivariogram with a short range with another semivariogram with a long range, and the resulting average will usually be better than either of the components individually. As for how to tell the difference between the primary and secondary datasets, look for "Var 1" and "Var 2" (primary and secondary). For example, the semivariogram display for "Var 1 - Var 1" shows the semivariogram for the primary variable. "Var 2 - Var 2" shows the semivariogram for the secondary dataset. "Var 1 - Var 2" shows the cross covariance. There will be similar "Var #" labels for the parameters on the right to distinguish between the three models.
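To illustrate the idea of mixing a short-range and a long-range semivariogram, here is a small stand-alone sketch. It is not how the software implements the "Model #" controls; it simply takes a weighted average of two textbook models (spherical and exponential, used here as stand-ins, with the nugget omitted). The function names, parameter values, and weights are all assumptions for the example.

```python
import math


def spherical(partial_sill, major_range):
    """Spherical semivariogram: reaches the sill exactly at the range."""
    def gamma(h):
        if h >= major_range:
            return partial_sill
        r = h / major_range
        return partial_sill * (1.5 * r - 0.5 * r ** 3)
    return gamma


def exponential(partial_sill, major_range):
    """Exponential semivariogram, scaled so ~95% of the sill is reached at the range."""
    def gamma(h):
        return partial_sill * (1.0 - math.exp(-3.0 * h / major_range))
    return gamma


def mix(models, weights):
    """Weighted average of semivariogram functions (weights >= 0, summing to 1)."""
    if not models or len(models) != len(weights):
        raise ValueError("need one weight per model")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")

    def gamma(h):
        return sum(w * m(h) for w, m in zip(weights, models))
    return gamma


# One short-range process and one long-range process (hypothetical parameters).
short_range = spherical(partial_sill=1.0, major_range=10.0)
long_range = exponential(partial_sill=1.0, major_range=100.0)
combined = mix([short_range, long_range], [0.5, 0.5])

for h in (0.0, 5.0, 10.0, 50.0, 100.0):
    print(h, round(combined(h), 3))
```

The combined curve rises quickly at short distances (from the short-range component) and keeps rising slowly at longer distances (from the long-range component), which neither model can do on its own.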

