My question concerns areal interpolation and data normality.

I am currently working on an M.Sc. thesis whose purpose is to describe the agricultural soils in my region. To do so, a 20 ha grid was superimposed over the region, and for each cell a composite soil sample consisting of 20 soil cores was taken in a W pattern over the entire support. Since I've taken composite samples, my data for every cell (I've used each cell as a separate polygon) can be considered an average for that area.

I would like to clarify the ArcGIS help file for areal interpolation. It states that "Areal interpolation for continuous data requires that the data is Gaussian and averaged over defined polygons".

My question is now, does the data need to be Gaussian within each individual polygon or does the overall dataset for the variable need to be normally distributed?

I would imagine that, given the nature of my data, if normality is only required within each polygon, I can assume it and go on with life.

Another question: how important is normality if it's required or suggested for the entire dataset of the variable? The purpose of these maps is just to create a density surface. I will not re-aggregate my data into new polygons.

I would like these maps to accompany statistical comparisons of contrasting land uses and soil types in the region. In other words, the purpose of these maps is to create an easily interpretable visual surface of individual variables so that the reader and I can better understand their spatial distribution.

Jonathan

Regarding how important the Gaussian assumption is, this is difficult to answer. Kriging theory assumes a Gaussian distribution in order for the predictions to be the "best linear unbiased predictions." If the data is not Gaussian, this property no longer holds. Kriging is known to be fairly robust to non-Gaussian distributions, but none of its attractive statistical properties hold if the data is not Gaussian.

I'll actually change that documentation. Saying areal interpolation "requires" a Gaussian distribution is too strong a statement. Instead, it should say that the kriging equations assume a Gaussian distribution.
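
For anyone who wants a rough sense of how far their polygon-level values depart from a Gaussian shape, skewness is a quick first look. The following is only a minimal sketch on synthetic numbers, not real soil data: it assumes lognormal core values and 20 cores per cell (mirroring the composite sampling described in the question), and the function and variable names are hypothetical. It illustrates that averaging cores tends to pull the cell values toward symmetry, without guaranteeing that they are Gaussian.

```python
import random
import statistics


def sample_skewness(values):
    """Moment-based sample skewness; close to 0 for symmetric (e.g. Gaussian) data."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Synthetic stand-in for one soil variable (assumption: lognormal core values).
rng = random.Random(0)
n_cells = 2000
cores_per_cell = 20

# One individual core per cell: strongly right-skewed.
single_cores = [rng.lognormvariate(0.0, 1.0) for _ in range(n_cells)]

# One composite value per cell: the mean of 20 cores.
cell_means = [
    statistics.fmean([rng.lognormvariate(0.0, 1.0) for _ in range(cores_per_cell)])
    for _ in range(n_cells)
]

print("skewness of single cores:", round(sample_skewness(single_cores), 2))
print("skewness of cell means:  ", round(sample_skewness(cell_means), 2))
```

With real data, `cell_means` would simply be the list of measured composite values, one per polygon. If the skewness is still clearly different from zero, a transformation of the variable is a common thing to try before interpolating.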