Incremental Spatial Autocorrelation Model Failing

AmberT · 08-10-2011

Hello,

I've been trying to determine a proper distance band for my data. I have 461,565 points (throughout New York State), and my input field is an aggregated field with a sum of almost 8 million. Every time I run the Incremental Spatial Autocorrelation script, or the Spatial Autocorrelation (Moran's I) script, it fails with a "<type 'exceptions.MemoryError'>:".

I'm working with a projected data set, and the data frame has the matching projection (NAD 1983 UTM Zone 18N). When I enter thresholds, I make sure I am referring to meters. Is there a limit to the size of the input features? Should I close other applications when running the script? I am working with files in a geodatabase. I exported the file to a shapefile, created a .SWM, and used that in the Spatial Autocorrelation script. That worked, but I would need to run the script many times to determine the best distance. I'm running the Incremental Spatial Autocorrelation script on the shapefile right now, and it seems to be running better, but slowly.

Thanks,
Amber

AmberT · 08-12-2011

I tried running the Incremental Spatial Autocorrelation model on a shapefile, but it failed with the same memory error. Any ideas on the limit of this tool?

LaurenRosenshein · 08-18-2011

Hi Amber,

I'm really sorry you're having trouble with the Incremental Spatial Autocorrelation sample script. At 10.1 the Incremental Spatial Autocorrelation tool will be part of ArcGIS, and we're working really hard to deal with some of the issues that have come up since the release of the sample script. For now, though, there are some things that you can do.

The most likely reason that you're having issues with memory is that, at the distances you're using to test for spatial autocorrelation, many of the features have tens of thousands of neighbors. Ideally, you want to use distances that give your features no more than maybe a hundred, a couple hundred, maybe even 1000 neighbors...but no feature should ever have 100,000 neighbors. A good way to see if this is your problem is to run the Generate Spatial Weights Matrix tool for some of your largest distance increments. The tool will tell you the maximum number of neighbors that any feature has. If you are seeing huge numbers there, then that is likely to be your problem with Incremental Spatial Autocorrelation. The solution is to lower the distances that you're testing so that each feature has a more reasonable number of neighbors.
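
As a rough illustration of the neighbor-count check described above, here is a minimal pure-Python sketch that reports the largest number of neighbors any feature has at a given distance. It is not the Generate Spatial Weights Matrix tool or any part of its code; the function name `max_neighbors` is made up for this example, and it assumes the points are (x, y) pairs in a projected coordinate system with the distance in the same units (meters here).

```python
from collections import defaultdict
from math import floor


def max_neighbors(points, distance):
    """Return the largest number of neighbors any point has within `distance`.

    Hypothetical helper for illustration only; `points` is a list of (x, y)
    tuples in projected units (for example meters).
    """
    # Bucket points into square cells of side `distance`, so every neighbor
    # of a point lies in its own cell or one of the 8 surrounding cells.
    cells = defaultdict(list)
    for i, (x, y) in enumerate(points):
        cells[(floor(x / distance), floor(y / distance))].append(i)

    d2 = distance * distance
    best = 0
    for i, (x, y) in enumerate(points):
        cx, cy = floor(x / distance), floor(y / distance)
        count = 0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    if j == i:
                        continue  # a feature is not its own neighbor
                    px, py = points[j]
                    if (px - x) ** 2 + (py - y) ** 2 <= d2:
                        count += 1
        best = max(best, count)
    return best


if __name__ == "__main__":
    sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
    for band in (0.5, 1.5, 20.0):
        print(band, max_neighbors(sample, band))
```

Running something like this on a sample of the data for the largest distance increments gives a quick sense of whether the neighbor counts are in the hundreds or in the tens of thousands.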

One thing that you may be running into is that the distance at which each feature has at least one neighbor is large, maybe because of outliers (a couple of features that are really far away from all of the other features). A good option is to create a selection set that does not include the outliers and use just those features to figure out a good beginning and increment distance...and ultimately you would run Incremental Spatial Autocorrelation on just the selection set (without the outliers). After you find a peak and choose a threshold distance, you can then use the Generate Spatial Weights Matrix tool to create a weights matrix that uses that threshold distance together with a minimum number of neighbors. For the majority of the features it will use the distance band you chose, but for the outliers, where that distance band may be too small for them to have any neighbors, it will extend the search just far enough to reach the minimum number of neighbors. That way you can use a threshold distance that makes sense for the majority of your features, but still include the outliers in your analysis.
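
The "threshold distance plus minimum number of neighbors" idea can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the tool's implementation; the function name `neighbors_with_minimum` is made up, and the brute-force distance search is only practical for a small sample of features.

```python
from math import hypot


def neighbors_with_minimum(points, threshold, min_neighbors):
    """Map each point index to its neighbor indices.

    Hypothetical helper for illustration only. A point's neighbors are all
    points within `threshold`; if there are fewer than `min_neighbors` of
    them, the nearest `min_neighbors` points are used instead.
    """
    result = {}
    for i, (x, y) in enumerate(points):
        # Distances from feature i to every other feature, nearest first.
        dists = sorted(
            (hypot(px - x, py - y), j)
            for j, (px, py) in enumerate(points)
            if j != i
        )
        within = [j for d, j in dists if d <= threshold]
        if len(within) < min_neighbors:
            # Outlier case: fall back to the closest features.
            within = [j for d, j in dists[:min_neighbors]]
        result[i] = within
    return result


if __name__ == "__main__":
    sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (100.0, 0.0)]
    print(neighbors_with_minimum(sample, threshold=1.5, min_neighbors=1))
```

In the sample, the point at x = 100 has nothing within the 1.5 unit band, so it is given its single nearest feature instead of being left without neighbors.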

JeffreyEvans · 08-19-2011

This tool is in essence calculating a Moran's correlogram. I would highly recommend a literature search and reading up on the method. Helen Wagner has published some work that provides guidance on the use and limitations of correlograms. Caution must be used with this method because if the maximum distance bandwidth exceeds the range of the semivariogram (the distance at which it reaches its sill), the results can be very misleading. The bottom line is that a measure of autocorrelation derived from an "all neighbor" spatial weights matrix does not make sense. You should try to formulate a testable hypothesis of autocorrelation (distance ranges or Nth-order contiguity) based on what you know about your process. An example of this would be testing a range of autocorrelation based on the maximum dispersal distance of an organism.
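
To make the idea of a Moran's correlogram concrete, here is a minimal pure-Python sketch that computes global Moran's I with binary fixed-distance weights at several distance bands. It is an illustration under simplifying assumptions, not the ArcGIS tool: the function names are made up, the weights are not row-standardized, no z-scores or p-values are computed, and the brute-force pair loop is only suitable for small samples.

```python
from math import hypot


def morans_i(points, values, distance):
    """Global Moran's I with binary weights (w_ij = 1 if d_ij <= distance).

    Hypothetical helper for illustration only. Returns None when no pair of
    points falls within `distance` or when the values have zero variance.
    """
    n = len(points)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)

    s0 = 0.0     # sum of all weights
    cross = 0.0  # sum of w_ij * dev_i * dev_j
    for i in range(n):
        xi, yi = points[i]
        for j in range(i + 1, n):
            xj, yj = points[j]
            if hypot(xi - xj, yi - yj) <= distance:
                # Weights are symmetric, so count the pair in both directions.
                s0 += 2.0
                cross += 2.0 * dev[i] * dev[j]

    if s0 == 0.0 or denom == 0.0:
        return None
    return (n / s0) * (cross / denom)


def correlogram(points, values, distances):
    """Return (distance, Moran's I) pairs, one per distance band."""
    return [(d, morans_i(points, values, d)) for d in distances]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    vals = [1.0, 2.0, 3.0, 4.0]
    for d, i_stat in correlogram(pts, vals, [1.0, 2.0, 3.0]):
        print(d, i_stat)
```

Restricting `distances` to a range suggested by the process being studied, rather than letting the bands grow until every feature neighbors every other, is the point made above about avoiding an "all neighbor" weights matrix.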