Help understanding the kriging formula

EliseosMucaki · 04-24-2019

Hello,

I was reading the documentation on kriging on the ArcGIS website, and I am having trouble understanding the difference between the two formulas presented on these separate pages discussing Kriging.

How Kriging works—Help | ArcGIS for Desktop

What are the different kriging models?—Help | ArcGIS Desktop

From my understanding, it seems like the first page describes the general kriging formula, while the second has more to do with autocorrelation. But it's not clear. I was hoping someone could try and clarify what the two formulas represent, or provide a source that does.

Thank you.

DanPatterson_Retired · 04-24-2019

Steve Lynch, are there any other references that you or others use besides those listed in the ArcMap and ArcGIS Pro help?

How Kriging works—Help | ArcGIS Desktop

EricKrause · 04-25-2019

Hi Eliseos,

I think part of the confusion is that these two topics are written for very different purposes. The first link is to the documentation for the Kriging tool in the Spatial Analyst toolbox. It is a quick reference for how kriging works from beginning to end. It is all you should need to read in order to understand how to use the Kriging tool.

The second link is to a topic in the Geostatistical Analyst extension. This extension was designed specifically for complex kriging workflows, so the documentation goes into a lot more depth and is spread across many topics. This particular topic is very narrow in scope: it covers only how the different kriging models (ordinary, simple, universal, cokriging, etc.) differ from each other in their most basic definitions. These definitions involve autocorrelation and the mean function, and knowing how the models are defined will hopefully help you decide which one is most applicable to your data. But this topic alone is not nearly enough to understand how kriging works.

Here is a link to an old Geostatistical Analyst manual that has many formulas in the appendix that are not in the online help documentation:

http://dusk2.geo.orst.edu/gis/geostat_analyst.pdf

If there are any particular formulas that you have questions about, let me know.

-Eric

EliseosMucaki · 04-25-2019

Thank you for this resource. I better understand the purpose of the two pages.

Just to be clear, are both formulas essentially two ways that kriging can be represented? The purpose of this post was to help me define the second formula [Z(s) = µ(s) + ε(s)] for a document I am writing. Would it be accurate to say that this formula represents kriging as a "spatially autocorrelated process, accounting for random error"?

Thanks again.

EricKrause · 04-25-2019

That particular formula is the definition of the kriging model. The variable s always represents an arbitrary spatial location. Z(s) is the measured value at location s. Since the goal is to interpolate, you want to predict Z(s) at every s where you didn't take a measurement.

The equation is saying that the measured value Z(s) is equal to the sum of a mean function µ(s) and a spatially autocorrelated error function ε(s). For your paper, I would write something along those lines.
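If it helps to see the decomposition concretely, here is a minimal Python sketch (not taken from ArcGIS) that simulates values along a one-dimensional transect as a mean function plus a spatially autocorrelated error. The function name simulate_field, the exponential covariance, and all parameter values (sill, corr_range, the linear mean) are assumptions chosen purely for illustration.

```python
import numpy as np

def simulate_field(s, mean_fn, sill=1.0, corr_range=10.0, seed=0):
    """Simulate Z(s) = mu(s) + eps(s) at 1-D locations s (illustrative only)."""
    rng = np.random.default_rng(seed)
    s = np.asarray(s, dtype=float)
    mu = mean_fn(s)  # the mean function mu(s)
    # Assumed exponential covariance: errors at nearby locations are
    # strongly correlated, errors at distant locations are nearly independent.
    dist = np.abs(s[:, None] - s[None, :])
    cov = sill * np.exp(-dist / corr_range)
    # Turn independent normal draws into correlated ones via the Cholesky
    # factor (a tiny diagonal jitter keeps the factorization stable).
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(s)))
    eps = chol @ rng.standard_normal(len(s))
    return mu, eps, mu + eps

# Hypothetical transect: 50 locations, mean rising gently from 5 to 7.
s = np.linspace(0.0, 100.0, 50)
mu, eps, z = simulate_field(s, mean_fn=lambda x: 5.0 + 0.02 * x)
```

Here z plays the role of the measured values Z(s), mu is the mean function, and eps is the autocorrelated error. Kriging works in the opposite direction: it starts from z at the measured locations and has to estimate the other two pieces.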

The differences between the various kriging models usually depend on how these mean and error terms are defined and estimated.

There is also a common convention in statistics that is very important but also very easy to miss if you don't know to look for it. Look at the following two nearly identical models:

Z(s) = µ(s) + ε(s) [Universal kriging]
Z(s) = µ + ε(s) [Ordinary kriging]

The only difference in the model definition between universal and ordinary kriging is µ versus µ(s). Since s refers to a location and µ refers to the mean, the notation µ(s) indicates that the mean depends on the location. Similarly, the notation µ indicates that the mean does not depend on the location; in other words, the mean is constant at every location. This is very significant because a mean value that changes from location to location is usually called a trend. The real difference between those two models is that the first supports a trend and the second doesn't, and all of this gets indicated by a tiny difference in notation.
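As a rough illustration of constant mean versus trend, the Python sketch below fits both kinds of mean to the same made-up data. Everything in it is an assumption for demonstration: the helper names, the five sample points, and the use of plain least squares, which ignores autocorrelation (actual kriging software handles the mean differently, for example through the unbiasedness constraint in ordinary kriging).

```python
import numpy as np

def fit_constant_mean(z):
    """Constant mean mu: one number for every location."""
    return float(np.mean(z))

def fit_linear_trend(coords, z):
    """Location-dependent mean mu(s) = b0 + b1*x + b2*y, by least squares."""
    coords = np.asarray(coords, dtype=float)
    design = np.column_stack([np.ones(len(coords)), coords])
    beta, *_ = np.linalg.lstsq(design, z, rcond=None)
    return beta, design @ beta  # coefficients and fitted mean at each point

# Made-up data that follow an exact planar trend: z = 10 + 2x - y.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
z = 10.0 + 2.0 * coords[:, 0] - coords[:, 1]

mu_constant = fit_constant_mean(z)            # same value everywhere
beta, mu_trend = fit_linear_trend(coords, z)  # varies with location

resid_constant = z - mu_constant  # the trend is left in the residuals
resid_trend = z - mu_trend        # the trend has been absorbed by the mean
```

With a constant mean, the systematic west-to-east increase ends up in the residuals and would be treated as part of ε(s). With a location-dependent mean, that same structure is captured by µ(s) instead.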

But even once the model definition is understood, there are still formulas and equations for estimating the mean value or trend, more formulas for estimating the autocorrelated error term (this is where the semivariogram comes into play), more formulas for neighborhoods, predictions, transformations, etc. It's a lot to unpack, but that's why the Geostatistical Analyst help is very long and spread across many topics.
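To give a flavor of one of those pieces, here is a hedged Python sketch of the classical empirical semivariogram: for each distance bin, it averages half the squared difference between all pairs of values whose separation falls in that bin. The function name, the binning scheme, and the toy data are assumptions for illustration, and this is not how Geostatistical Analyst implements it internally.

```python
import numpy as np

def empirical_semivariogram(coords, z, bin_edges):
    """Average of 0.5 * (z_i - z_j)^2 over point pairs, grouped by distance."""
    coords = np.asarray(coords, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(z), k=1)  # every unordered pair exactly once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    half_sq_diff = 0.5 * (z[i] - z[j]) ** 2
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)  # NaN marks bins with no pairs
    counts = np.zeros(n_bins, dtype=int)
    for k in range(n_bins):
        in_bin = (dist >= bin_edges[k]) & (dist < bin_edges[k + 1])
        counts[k] = in_bin.sum()
        if counts[k] > 0:
            gamma[k] = half_sq_diff[in_bin].mean()
    return gamma, counts

# Toy data: four points on a line, one unit apart.
coords = np.array([[0.0], [1.0], [2.0], [3.0]])
z = np.array([1.0, 2.0, 4.0, 7.0])
gamma, counts = empirical_semivariogram(coords, z, bin_edges=[0.5, 1.5, 2.5, 3.5])
```

In this toy example the semivariance grows with distance, meaning that values farther apart are less alike. In a real workflow, a semivariogram model is fitted to points like these and then used to derive the kriging weights.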

-Eric