I have calculated Ripley’s K function for households and household-events for a district. I get unusual graphs with observed above expected at all distances - 2 almost straight lines. I would expect clustering of households at shorter distances (within communities) and dispersion at larger distances (between communities). I used the aggregate points feature to create the study area polygon. I wondered if my output is likely to be an artefact of households not being possible at all locations (such as forests and water bodies), while for “expected” the assumption is that households can fall anywhere? Appreciate the feedback.
From the help http://desktop.arcgis.com/en/arcmap/latest/tools/spatial-statistics-toolbox/multi-distance-spatial-c...
Have you limited the scope of the study area by any means?
The k-function statistic is very sensitive to the size of the study area. Identical arrangements of points can exhibit clustering or dispersion depending on the size of the study area enclosing them. Therefore, it is imperative that the study area boundaries are carefully considered. .....
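To make that sensitivity concrete, here is a minimal Python sketch (NumPy only, synthetic points, no edge correction, so not what the Arc tool does internally). The function name `naive_l` and all the numbers are made up for illustration. It computes the usual L(d) transformation of K, for which the expected value under complete spatial randomness is roughly d, on the same points with two different study area sizes.

```python
import numpy as np

def naive_l(points, distances, area):
    """Naive L(d) = sqrt(K(d) / pi) with no edge correction (illustrative only)."""
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    pair_d = dist[np.triu_indices(n, k=1)]  # each unordered pair once
    # Count ordered pairs (i != j) within each distance d
    counts = np.array([2 * np.count_nonzero(pair_d <= d) for d in distances])
    k = area * counts / (n * (n - 1))
    return np.sqrt(k / np.pi)

rng = np.random.default_rng(0)
points = rng.uniform(0.0, 10.0, size=(200, 2))  # synthetic points in a 10 x 10 square
distances = np.array([1.0, 2.0, 3.0])

# Same points, two different study areas
l_tight = naive_l(points, distances, area=10.0 * 10.0)  # area that just encloses the points
l_loose = naive_l(points, distances, area=30.0 * 30.0)  # area with lots of empty space

print("d      :", distances)
print("L tight:", l_tight)
print("L loose:", l_loose)
```

Because the area enters only as a multiplier, the looser study area scales L(d) by a constant factor (here 3, the square root of 900/100). A study area containing a lot of space where points cannot occur would therefore tend to push the observed line above the expected one at every distance, which may be part of what the original graphs show.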
The study area includes all communities and households in a district. I created the study area using "aggregate points" for households - although not with a very fine aggregation distance as this created separate study area islands across the district which isn't valid input for Ripley's. The households are gathered in communities and along roads etc - my aim was to calculate the K-function for all households and for case-households to then estimate the difference between the 2 functions to indicate any extra aggregation of cases above the background population.
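For anyone following along, the comparison described above can be sketched outside Arc roughly as follows. This is only an illustration on synthetic data, assuming a naive K estimate with no edge correction and a random-labelling simulation to get an envelope; `naive_k`, `k_difference_envelope` and every number below are hypothetical, not part of any Arc tool.

```python
import numpy as np

def naive_k(points, distances, area):
    """Naive Ripley's K with no edge correction (illustrative only)."""
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    pair_d = dist[np.triu_indices(n, k=1)]  # each unordered pair once
    counts = np.array([2 * np.count_nonzero(pair_d <= d) for d in distances])
    return area * counts / (n * (n - 1))

def k_difference_envelope(households, is_case, distances, area, n_sim=99, seed=0):
    """Observed K_cases - K_all, plus min/max of the same difference under random relabelling."""
    rng = np.random.default_rng(seed)
    k_all = naive_k(households, distances, area)
    observed = naive_k(households[is_case], distances, area) - k_all
    sims = np.empty((n_sim, len(distances)))
    for s in range(n_sim):
        shuffled = rng.permutation(is_case)  # reassign case labels at random
        sims[s] = naive_k(households[shuffled], distances, area) - k_all
    return observed, sims.min(axis=0), sims.max(axis=0)

# Synthetic example: 300 households, the 40 nearest the centre are labelled as cases
rng = np.random.default_rng(1)
households = rng.uniform(0.0, 10.0, size=(300, 2))
centre_dist = np.sqrt(((households - 5.0) ** 2).sum(axis=1))
is_case = np.zeros(len(households), dtype=bool)
is_case[np.argsort(centre_dist)[:40]] = True

distances = np.array([1.0, 2.0, 3.0])
observed, lower, upper = k_difference_envelope(households, is_case, distances, area=100.0)
print("observed difference:", observed)
print("envelope lower     :", lower)
print("envelope upper     :", upper)
```

Where the observed difference sits above the simulated envelope, the cases are more aggregated than the households they were drawn from. Since both K estimates use the same area and the same household locations, the area acts as a common scale factor on the difference and on the envelope alike, so the comparison between them does not depend on how generously the study area was drawn.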
I suspect that showing the pattern you have would be useful. As soon as you mention areas where houses can't possibly be located, that evokes a certain picture, one which you have potentially confirmed with your island groupings. As soon as you expand the area of interest, the density pattern changes, as shown in the images in the help file.
The image shows part of the study area. As you say, there are areas where it would be impossible for houses to be located. I understand how the pattern changes with a smaller or larger area of interest, but I can only really use the whole study area; I can't try to exclude all the individual areas of forest, water bodies etc. with no houses. Should I still be able to produce a valid analysis?
I won't comment on that, but if linearity is evident in your data, there is a whole body of statistical tests that apply to network space. There are even papers on extensions and interpretations of Ripley's K for when the free-space assumptions just can't be met. In short, is this particular test needed, or is it simply the one that appears the most promising out of what is offered within Arc*? It has been years since I have looked at this area, and most examples came from landscape ecology. Other packages (R?) that deal with analysis under different spatial metrics might be worth looking into.
Perhaps someone from the Spatial Stats team might have some specific suggestions.