Thanks. I understand what the L transformation does, so if that is indeed the formula that is being used then it seems like the output.dbf should describe the fields as ExpectedL(Distance) and ObservedL rather than saying K. The .dbf fields differ from the results window box, which labels the output fields Distance and L(d).
My reading on the L transformation suggests that it is used to make graphical interpretation of results more straightforward. In that case, it would be more helpful if the graphical output displayed L(d)-d on the y axis so that the expected line (which from my understanding represents complete spatial randomness) is equal to y=0 rather than a line with a slope of 1. Also, the legend of the graphic should say ExpectedL and ObservedL to be consistent with the formula that is used.
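A minimal sketch of the point above, assuming the textbook transformation L(d) = sqrt(K(d) / pi) and the textbook CSR expectation K(d) = pi * d^2. The tool's exact formula may differ, and the function name `l_from_k` is just an illustrative choice. Under these assumptions the expected L(d) equals d, so plotting L(d) - d would put the expected line at y = 0:

```python
import math

def l_from_k(k_value):
    """Textbook L transformation: L(d) = sqrt(K(d) / pi)."""
    return math.sqrt(k_value / math.pi)

for d in [1.0, 2.0, 5.0]:
    k_csr = math.pi * d ** 2   # expected K(d) under complete spatial randomness
    l_csr = l_from_k(k_csr)    # equals d, up to floating-point rounding
    # L(d) plots as a line with slope 1; L(d) - d plots as a flat line at 0.
    print(d, l_csr, l_csr - d)
```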
I look forward to hearing your explanation of why the confidence intervals at some distances do not include the expected value for L (CSR).
Has this issue been resolved? When I run Ripley's K, I use the simulate outer boundary values correction and supply a polygon feature class as the study area. The upper and lower confidence lines still don't follow the expected K.
Best,
Vita
Hi Laurence,
Thanks for your question. There are several reasons that the confidence envelope may not follow the expected line.
1) Differences between weighted and unweighted K function.
When you run the K function on just your point features (no weight field), the confidence envelope will tend to follow the Expected line (the blue line). The confidence envelope is created by taking your point features and (conceptually) throwing them down into your study area (a rectangle if you select minimum enclosing rectangle, otherwise the polygon feature you provide). The tool repeats this random process of throwing down your points, letting them fall where they may within the study area, 9, 99, or 999 times. Each time, it computes the K function value for all distances; the lower confidence line is derived from the lowest L(d) values obtained and the upper confidence line from the largest L(d) values. If the study area is simple (rectangle, circle), the confidence envelope will enclose the expected line (but see #2 and #3 below).
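The random "throwing down" process can be sketched roughly as follows. This is an illustration only, not the tool's code: it assumes a rectangular study area, no boundary correction, and the uncorrected form L(d) = sqrt(A * pairs(d) / (pi * n * (n - 1))). The names `l_values` and `csr_envelope` are hypothetical.

```python
import math
import random

def l_values(points, distances, area):
    """Uncorrected L(d) for each distance: sqrt(A * pairs / (pi * n * (n - 1)))."""
    n = len(points)
    result = []
    for d in distances:
        # Count ordered pairs (i, j), i != j, that lie within distance d.
        pairs = sum(
            1
            for i in range(n)
            for j in range(n)
            if i != j and math.dist(points[i], points[j]) <= d
        )
        result.append(math.sqrt(area * pairs / (math.pi * n * (n - 1))))
    return result

def csr_envelope(n, width, height, distances, permutations=99, seed=0):
    """Min and max L(d) over random placements of n points in a rectangle."""
    rng = random.Random(seed)
    area = width * height
    lower = [math.inf] * len(distances)
    upper = [-math.inf] * len(distances)
    for _ in range(permutations):
        # Throw the points down at random inside the study area.
        pts = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]
        for k, value in enumerate(l_values(pts, distances, area)):
            lower[k] = min(lower[k], value)
            upper[k] = max(upper[k], value)
    return lower, upper

lower, upper = csr_envelope(n=20, width=10.0, height=10.0,
                            distances=[1.0, 2.0, 3.0], permutations=9)
print(lower, upper)
```

Because no boundary correction is applied in this sketch, the envelope can sag below the expected line at larger distances, which is the effect described in #2.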
When you run the K function with a Weight Field, the confidence envelope will tend to follow the Observed L(d) line (the red line). In this case the confidence envelope is created by throwing down the feature values (the weights) onto the existing feature locations. The locations themselves remain fixed; only the weights associated with the features are randomly redistributed for 9, 99, or 999 permutations. Because the spatial distribution of your points restricts where the values can land, the confidence envelope follows the observed L(d) line, showing you the range of outcomes given the fixed locations of your features.
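A rough sketch of the weighted case, under a loud assumption: each pair within distance d contributes the product of its two weights. That weighting scheme is chosen here only for illustration and may not match what the tool does. The point of the sketch is the permutation step, where locations stay fixed and only the weights are shuffled. The names `weighted_l` and `weight_permutation_envelope` are hypothetical.

```python
import math
import random

def weighted_l(points, weights, d, area):
    """Illustrative weighted L(d); pair weight = w_i * w_j (an assumption)."""
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= d:
                total += weights[i] * weights[j]
    return math.sqrt(area * total / (math.pi * n * (n - 1)))

def weight_permutation_envelope(points, weights, d, area,
                                permutations=99, seed=0):
    """Min and max weighted L(d) when weights are shuffled over fixed locations."""
    rng = random.Random(seed)
    shuffled = list(weights)
    values = []
    for _ in range(permutations):
        rng.shuffle(shuffled)  # the locations never move, only the weights
        values.append(weighted_l(points, shuffled, d, area))
    return min(values), max(values)

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
wts = [1.0, 2.0, 3.0, 4.0, 5.0]
print(weight_permutation_envelope(pts, wts, d=1.5, area=36.0, permutations=9))
```

Since every permutation reuses the same locations, the envelope inherits whatever clustering the locations already have, which is why it tracks the observed line rather than the expected one.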
2) Boundary correction.
The K function works by counting all feature pairs within a given distance of each feature. When you specify NONE for the Boundary Correction method, this counting process is biased near the edges/boundaries. Imagine a circle representing the distance within which pairs will be counted. When that circle overlays a point/feature near an edge, a portion of the circle will fall outside the study area where there are no points, so the counts will be smaller because there are fewer pairs within the circle. If there really are no points/features outside the study area, this drop in clustering at increasing distances is valid. If the boundaries are an artifact, you should correct for this undercounting bias by selecting a Boundary Correction method.
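The undercounting near edges can be seen in a small sketch. It assumes a regular 11 x 11 grid of points and a counting distance of 2 grid units, both made up for illustration, and the helper `neighbor_count` is hypothetical:

```python
import math

def neighbor_count(target, points, d):
    """Number of other points within distance d of target."""
    return sum(1 for p in points if p != target and math.dist(p, target) <= d)

# A perfectly even pattern: one point at every integer coordinate from 0 to 10.
points = [(x, y) for x in range(11) for y in range(11)]
d = 2

interior = neighbor_count((5, 5), points, d)  # full circle inside the study area
edge = neighbor_count((5, 0), points, d)      # about half the circle is outside
corner = neighbor_count((0, 0), points, d)    # about three quarters is outside
print(interior, edge, corner)  # 12 8 5
```

The pattern is identical everywhere, yet the edge and corner points see fewer neighbors purely because part of their search circle falls outside the study area.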
3) Study area size.
The K Function is one of two tools in the Spatial Statistics Toolbox that is VERY (VERY) sensitive to study area size (the other tool is Average Nearest Neighbor). Imagine a cluster of points enclosed by a very, very tight study area... with that configuration, the pattern appears dispersed. Now imagine that same cluster of points enclosed by a very large study area (so the cluster is in the middle with vast space all around it)... now the points would definitely appear clustered. For a graphic, please see: http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/Multi_Distance_Spatial_Cluster_Analysi... (About the 12th usage tip that starts: "The k-function statistic is very sensitive to the size of the study area.").
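A small sketch of this sensitivity, assuming the uncorrected form L(d) = sqrt(A * pairs(d) / (pi * n * (n - 1))) and made-up data: the same 25 points, evaluated once with a tight study area and once with an area 100 times larger. The helper `l_value` is hypothetical.

```python
import math
import random

def l_value(points, d, area):
    """Uncorrected L(d) = sqrt(A * pairs / (pi * n * (n - 1)))."""
    n = len(points)
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= d
    )
    return math.sqrt(area * pairs / (math.pi * n * (n - 1)))

rng = random.Random(42)
# 25 points scattered inside a 1 x 1 square.
cluster = [(rng.random(), rng.random()) for _ in range(25)]
d = 0.5

l_tight = l_value(cluster, d, area=1.0)    # study area hugs the points
l_large = l_value(cluster, d, area=100.0)  # same points, 10 x 10 study area

# Only the area changed, so L(d) scales by sqrt(100) = 10.
print(l_tight - d, l_large - d)
```

The pair counts are identical in both runs. Only the area term changes, and that alone is enough to push L(d) far above d in the large study area, so the same points read as strongly clustered.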
4) Study area shape.
In #1 above, I described how the confidence envelopes are constructed. In essence, features are pitched onto your study area, each feature landing where it may. When you have a very convoluted study area, this can impact where features are allowed to land. Hmmm... okay, imagine a square study area with two long skinny arms, two long skinny legs, and a head 🙂 Features that fall into the arms and legs will have fewer neighbors because the study area itself doesn't allow many features to fall into the skinny parts (does that make sense?).
But this kind of thing can also happen if you elect the Minimum Enclosing Rectangle study area when your features aren't very rectangular. Imagine a set of features randomly distributed into a circle. Then imagine a rectangular study area around it. In the corners of the study area there will be no features. When the K function starts counting pairs near those corners, the pair counts will drop. This can result in a drooping confidence envelope for weighted K function.
I hope this helps. If you still have questions, please feel free to contact me. I am happy to look at your data and evaluate the results to see why you might be seeing the drooping confidence envelope even when you apply a boundary correction method.
Best wishes,
Lauren
Lauren M Scott, PhD
Esri
Geoprocessing, Spatial Statistics
LScott@esri.com