POST
Just to clarify the point from the previous post about the convergence between the confidence envelope and the observed line. Theoretically, they should indeed converge at the MAX(L(t)) for a given study area size and number of points, but only at a distance of at least half of the maximum dimension of the study area. And in cases of geometrically simple study areas and unweighted K simulations, the confidence envelopes should not deviate too much from the Expected line (apart from the usual boundary-effect drop-off at larger distances), as Lauren has repeatedly mentioned. The problem is, unweighted simulations sometimes seem to behave similarly to the weighted ones...
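The convergence value mentioned above can be checked numerically. What follows is only a sketch, not ArcGIS code: it assumes the common uncorrected estimator L(d) = sqrt(A * (ordered pairs within d) / (pi * n * (n - 1))) with no boundary correction, and the names `ordered_pair_distances` and `l_function` as well as the sample points are made up for illustration. Under that assumption, once d reaches the largest inter-point distance every pair is counted, so L(d) flattens at sqrt(A / pi) whatever the pattern looks like.

```python
import numpy as np

def ordered_pair_distances(points):
    """Distances for all n * (n - 1) ordered pairs (i != j)."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[~np.eye(len(pts), dtype=bool)]

def l_function(points, distances, area):
    """Uncorrected L(d); ASSUMED estimator, no boundary correction."""
    n = len(points)
    pair_d = ordered_pair_distances(points)
    counts = np.array([(pair_d <= d).sum() for d in distances])
    return np.sqrt(area * counts / (np.pi * n * (n - 1)))

# Hypothetical data: 11 points in a narrow band and 11 scattered points,
# both inside an assumed 100 x 100 study area.
rng = np.random.default_rng(0)
area = 100.0 * 100.0
band = np.column_stack([rng.uniform(45, 55, 11), rng.uniform(0, 100, 11)])
scattered = rng.uniform(0, 100, size=(11, 2))

plateau = np.sqrt(area / np.pi)  # value L(d) takes once all pairs are counted
for name, pts in (("band", band), ("scattered", scattered)):
    max_sep = ordered_pair_distances(pts).max()
    print(name, round(max_sep, 1), l_function(pts, [max_sep, 2 * max_sep], area), plateau)
```

Each pattern reaches the plateau at its own maximum point separation, so a set of truly random simulations would not be expected to flatten at exactly the same distance as the observed points.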
04-19-2012
10:17 PM
POST
Hi Laurence,

Thanks for your question. There are several reasons that the confidence envelope may not follow the expected line.

1) Differences between weighted and unweighted K function. When you run K function just on your point features (no weight field), the confidence envelope will tend to follow the Expected line (the blue line). The confidence envelope is created by taking your point features and (conceptually) throwing them down into your study area (a rectangle if you select minimum enclosing rectangle, otherwise the polygon feature you provide). It repeats this random process of throwing down your points, letting them fall where they may within the study area, 9, 99, or 999 times. Each time it computes the K function value for all distances; the lower confidence line is derived from the lowest observed L(d) values and the upper confidence line from the largest L(d) values. If the study area is simple (rectangle, circle), the confidence envelope will enclose the expected line (but see #2 and #3 below).

When you run the K function with a Weight Field, the confidence envelope will tend to follow the Observed L(d) line (the red line). In this case the confidence envelope is created by throwing down the feature values (the weights) onto the existing feature locations. The locations themselves remain fixed; only the weights associated with the features are randomly redistributed for 9, 99, or 999 permutations. Because the spatial distribution of your points restricts where the values can land, the confidence envelope follows the observed L(d) line, showing you the range of outcomes given the fixed locations of your features.

2) Boundary correction. The K function works by counting all feature pairs within a given distance of each feature. When you specify NONE for the Boundary Correction method, this counting process is biased near the edges/boundaries. Imagine a circle representing the distance where pairs will be counted. When that circle overlays a point/feature near an edge, a portion of the circle will fall outside the study area where there are no points... the counts will be smaller because there are fewer pairs within the circle. If there really are no points/features outside the study area, this drop in clustering at increasing distances is valid. If the boundaries are an artifact, you should correct for this undercounting bias by selecting a Boundary Correction method.

3) Study area size. The K Function is one of two tools in the Spatial Statistics Toolbox that is VERY (VERY) sensitive to study area size (the other tool is Average Nearest Neighbor). Imagine a cluster of points enclosed by a very, very tight study area... with that configuration, the pattern appears dispersed. Now imagine that same cluster of points enclosed by a very large study area (so the cluster is at the middle with vast space all around it)... now the points would definitely appear clustered. For a graphic, please see: http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/Multi_Distance_Spatial_Cluster_Analysis_Ripley_s_K_Function/005p0000000m000000/ (About the 12th usage tip that starts: "The k-function statistic is very sensitive to the size of the study area.").

4) Study area shape. In #1 above, I described how the confidence envelopes are constructed. In essence, features are pitched onto your study area, each feature landing where it may. When you have a very convoluted study area, this can impact where features are allowed to land. Hmmm... okay, imagine a square study area with two long skinny arms, two long skinny legs, and a head 🙂 Features that fall into the arms and legs will have fewer neighbors because the study area itself doesn't allow many features to fall into the skinny parts... (does that make sense)? But this kind of thing can also happen if you elect the Minimum Enclosing Rectangle study area when your features aren't very rectangular. Imagine a set of features randomly distributed into a circle. Then imagine a rectangular study area around it. In the corners of the study area there will be no features. When the K function starts counting pairs near those corners, the pair counts will drop. This can result in a drooping confidence envelope for the weighted K function.

I hope this helps. If you still have questions, please feel free to contact me. I am happy to look at your data and evaluate the results to see why you might be seeing the drooping confidence envelope even when you apply a boundary correction method.

Best wishes,
Lauren

Lauren M Scott, PhD
Esri Geoprocessing, Spatial Statistics
LScott@esri.com

Hi Lauren,

I also have a problem with the Ripley's confidence envelopes, almost regardless of the shape of the study area. It is easier to illustrate with an example. There are 11 points, quite obviously arranged in a band, within a quasi-rectangular study area:

[ATTACH=CONFIG]13657[/ATTACH]

On the Ripley's K graph (unweighted, study area defined, 99 permutations; ArcGIS 9.3.1), the observed line plots above the expected line (as expected for a clustered pattern). However, the confidence envelope closely follows the observed line. Curiously, the envelope converges to a single horizontal line which exactly coincides with the expected line at the distance equal to the distance between the furthermost sample points:

[ATTACH=CONFIG]13658[/ATTACH]

This appears to suggest that the permutations are not based on truly random sets of points, with each point randomly placed within the study area. The convergence to a horizontal line indicates that the 'random' permutations are reproducing essentially the same pattern, almost precisely maintaining the same maximum point separation. The ArcGIS outputs in this example are not due to weighting, the study area shape or size, or boundary effects: 999 permutations using a much larger, precisely rectangular area with the points in the middle produce the same results. The only way I could coax ArcGIS' Ripley's K function to confirm the existence of statistically significant clustering was by adding one or more 'fake' data points significantly removed from the cluster.

I would appreciate it if you could advise me how to produce more reliable confidence envelopes.

Thank you.

Regards,
Vladimir
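The two envelope constructions described in #1 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ArcGIS implementation: there is no boundary correction, the study area is assumed to be a plain rectangle, a pair is assumed to contribute the product of its two weights, and every name (`l_values`, `envelope_unweighted`, `envelope_weighted`) and all sample data are hypothetical.

```python
import numpy as np

def l_values(pts, distances, area, weights=None):
    """L(d) without boundary correction.
    ASSUMPTION: a pair (i, j) contributes w_i * w_j (1 when unweighted)."""
    pts = np.asarray(pts, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = np.ones(len(pts)) if weights is None else np.asarray(weights, dtype=float)
    pair_w = np.outer(w, w)
    np.fill_diagonal(pair_w, 0.0)              # a feature is not its own neighbor
    k = np.array([pair_w[dist <= d].sum() for d in distances])
    return np.sqrt(area * k / (np.pi * pair_w.sum()))

def envelope_unweighted(n, distances, width, height, n_sim, rng):
    """Throw n points at random into a width x height rectangle, n_sim times."""
    sims = np.array([
        l_values(np.column_stack([rng.uniform(0, width, n),
                                  rng.uniform(0, height, n)]),
                 distances, width * height)
        for _ in range(n_sim)
    ])
    return sims.min(axis=0), sims.max(axis=0)

def envelope_weighted(pts, weights, distances, area, n_sim, rng):
    """Keep locations fixed and shuffle the weights among them, n_sim times."""
    sims = np.array([
        l_values(pts, distances, area, rng.permutation(weights))
        for _ in range(n_sim)
    ])
    return sims.min(axis=0), sims.max(axis=0)

# Hypothetical data: 11 banded points with made-up weights in a 100 x 100 area.
rng = np.random.default_rng(1)
width = height = 100.0
area = width * height
pts = np.column_stack([rng.uniform(45, 55, 11), rng.uniform(0, 100, 11)])
weights = rng.uniform(1, 10, 11)
distances = np.array([5.0, 10.0, 20.0, 40.0, 80.0, 150.0])

observed = l_values(pts, distances, area)
lo_u, hi_u = envelope_unweighted(len(pts), distances, width, height, 99, rng)
lo_w, hi_w = envelope_weighted(pts, weights, distances, area, 99, rng)
print(observed, lo_u, hi_u, lo_w, hi_w, sep="\n")
```

In this sketch the weighted envelope can only vary as much as the fixed locations allow, which is the behavior Lauren describes for the weighted case; an unweighted envelope that hugs the observed line would not be produced by `envelope_unweighted` as written here.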
04-18-2012
11:15 PM