Hi Winn, I'm really sorry that you're having trouble running OLS and GWR on your feature classes! To figure out what's going on, could you provide the following information so that we can try to reproduce your problem and hopefully fix it? What version of ArcGIS are you using? What operating system are you using? What kind of feature classes are you using: polygons, points, or lines? How many features are there? Are they in a personal geodatabase, a file geodatabase, or something else? Does the problem seem data specific? For instance, if you try to run OLS or GWR on a completely different feature class, does it work, or do all feature classes cause this failure? If possible, providing the dataset that was giving you trouble would be very helpful in trying to reproduce the problem. If you can do that, it would also help to get screenshots of the tool dialogs before you run them, so that we know exactly which parameter choices you're making. If not, the above information should at least get us on the right track. Again, I'm very sorry that you're having trouble, and we'll do our best to figure it out! Lauren Rosenshein, Geoprocessing Product Engineer
09272011
Hi Mary, I'm sorry you're having trouble running Spatial Autocorrelation. There are a few possible reasons you're seeing this error. One possibility is that you have a selection set. If you have any features selected, the tool will honor that selection, so make sure you clear any selection before you run the tool (unless, of course, you specifically want to analyze that selection set). Another possibility is null values or invalid geometry, either of which could be causing trouble. For invalid geometry, try running the Repair Geometry tool and see if that helps. If none of this helps, you can send over your data and we can see if we can reproduce the issue here. Your next question is about recognizing raster layers. The best thing to do is to convert your rasters to points, at which point you can use the Spatial Autocorrelation tool. To the best of my knowledge there is no tool in ArcGIS that calculates spatial autocorrelation statistics for rasters. As far as your last question goes, I'm not 100% sure what you're trying to do, but it sounds like you may want to do a regression analysis, in which case you could use Ordinary Least Squares Regression (for which you will need point data and can do a raster to point conversion). You can learn a lot more about regression analysis in ArcGIS, and the assumptions of OLS in general, from the free one-hour training seminar called Regression Analysis Basics.
08182011
Hi Amber, I'm really sorry you're having trouble with the Incremental Spatial Autocorrelation sample script. At 10.1 the Incremental Spatial Autocorrelation tool will be part of ArcGIS, and we're working really hard to deal with some of the issues that have come up since the release of the sample script. For now, though, there are some things that you can do. The most likely reason that you're having issues with memory is that at the distances that you're using to test for spatial autocorrelation many of the features have tens of thousands of neighbors. Ideally, you want to use distances that give your features no more than maybe a hundred, a couple hundred, maybe even 1000 neighbors...but no feature should ever have 100,000 neighbors. A good way to see if this is your problem is to run the Generate Spatial Weights Matrix tool for some of your largest distance increments. The tool will tell you the maximum number of neighbors that any feature has. If you are seeing huge numbers there, then that is likely to be your problem with Incremental Spatial Autocorrelation. The solution is to lower the distances that you're testing so that each feature has a more reasonable number of neighbors. One thing that you may be running into is that the distance at which each feature has at least one neighbor is large, maybe because of outliers (a couple of features that are really far away from all of the other features). A good option is to create a selection set that does not include the outliers and use just those features to figure out a good beginning and increment distance...and ultimately you would run Incremental Spatial Autocorrelation on just the selection set (without the outliers). After you find a peak and choose a threshold distance, you can then use the Generate Spatial Weights Matrix tool to create a weights matrix that uses a threshold distance that you choose, but then you can also choose a minimum number of neighbors. 
For the majority of the features, the tool will use the distance band you chose, but for the outliers it will fall back on the minimum number of neighbors (since that distance band may be too small for them to have any neighbors). That way you can use a threshold distance that makes sense for the majority of your features, but still include the outliers in your analysis.
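To make the neighbor-count check a bit more concrete, here is a minimal plain-Python sketch of the idea. It is not the Generate Spatial Weights Matrix tool; the function name `max_neighbors` and the sample coordinates are made up for illustration, and it assumes projected x/y coordinates and straight-line distances. The brute-force loop is only practical for a small sample of features.

```python
import math

def max_neighbors(points, distance):
    """Return the largest number of neighbors any point has within `distance`."""
    best = 0
    for i, (xi, yi) in enumerate(points):
        count = 0
        for j, (xj, yj) in enumerate(points):
            # A feature is not counted as its own neighbor here.
            if i != j and math.hypot(xi - xj, yi - yj) <= distance:
                count += 1
        best = max(best, count)
    return best

# Hypothetical sample: a tight cluster of four points plus one far-away outlier.
pts = [(0, 0), (1, 0), (0, 1), (1, 1), (50, 50)]
small_band = max_neighbors(pts, 1.5)   # outlier has no neighbors at this distance
large_band = max_neighbors(pts, 100)   # every point neighbors every other point
print(small_band, large_band)
```

If the maximum climbs into the tens of thousands as the distance grows, that is the sign that the distance increments are too large.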
08182011
Hi Berk, That's a great question! For line and polygon features, feature centroids are used in distance computations. For multipoints, polylines, or polygons with multiple parts, the centroid is computed using the weighted mean center of all feature parts. The weighting for point features is 1, for line features is length, and for polygon features is area.
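As a rough illustration of the weighted mean center described above, here is a small Python sketch. The function name `weighted_mean_center` and the sample numbers are hypothetical, and it assumes you already have a representative x/y location and a weight (1, length, or area) for each part; it is not the code the tools actually run.

```python
def weighted_mean_center(parts):
    """parts: list of (x, y, weight) tuples, one per feature part."""
    total = sum(w for _, _, w in parts)
    if total == 0:
        raise ValueError("total weight must be non-zero")
    x = sum(px * w for px, _, w in parts) / total
    y = sum(py * w for _, py, w in parts) / total
    return x, y

# Hypothetical two-part polygon: each part's center, weighted by its area.
parts = [(0.0, 0.0, 30.0), (10.0, 0.0, 10.0)]
center = weighted_mean_center(parts)
print(center)
```

The larger part pulls the centroid toward itself, so the result sits closer to the 30-unit part than to the 10-unit part.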
08182011
Hi Eleanor, I'm not sure I understand what you mean by "pooling the residuals" from the local models, but I'll try to clarify. Basically, both OLS and GWR ultimately calculate a prediction for each feature in the dataset. For OLS that prediction is based on a global model. For GWR that prediction is based on a local model that was calibrated using nearby features. Either way, each feature ends up with a predicted value. The residuals, whether you are talking about OLS or GWR, are just the difference between that predicted value and the observed value. Once you have those residuals, calculating the R² value is exactly the same for both methods. The only difference is whether the predicted value was calculated using a global model (OLS) or a local model (GWR).
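Here is a minimal sketch of that last step in Python, assuming the usual textbook definition R² = 1 - SS_res / SS_tot. The function name `r_squared` and the numbers are made up for illustration. The point is that the function only needs observed and predicted values, and does not care whether the predictions came from a global or a local model.

```python
def r_squared(observed, predicted):
    """R-squared from observed and predicted values (1 - SS_res / SS_tot)."""
    mean_obs = sum(observed) / len(observed)
    # Residual = observed value minus predicted value, one per feature.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical values; the predictions could come from OLS or from GWR.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(observed, predicted)
print(round(r2, 4))
```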
08182011
Hi Karen, The kind of dummy variables you're talking about are fine in OLS alongside more conventional variables. They are also all right in GWR as long as they are not spatial regime variables. The potential problem is that with only two possible values, i.e., with dummy variables, there is a much higher probability of getting large areas that all share the same value, which then leads to issues with local multicollinearity when using GWR. Try it with GWR (once you've found a properly specified model using OLS). If it is a problem: 1) GWR will not solve and will report Severe Model Design. 2) If GWR does solve, be sure to check the condition number: a value > 30 indicates problems with local multicollinearity (i.e., the regression models for those features are unstable because of variable redundancy and you cannot trust the results). As for a logit version, at this time there is no logistic regression in ArcGIS. One option is to use R to do a logistic regression. We've got a sample of integrating R and ArcGIS, which you can find here.
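To show why dummy variables tend to cause this, here is a small Python sketch that flags features whose local neighborhood contains only one value of the dummy variable. In those neighborhoods the dummy is constant, so a local model cannot separate it from the intercept. This is only a diagnostic illustration, not what GWR does internally; the function name `constant_dummy_neighborhoods`, the feature IDs, and the neighborhood lists are all hypothetical.

```python
def constant_dummy_neighborhoods(neighborhoods, dummy):
    """Return IDs of features whose neighborhood has a single dummy value."""
    flagged = []
    for fid, members in neighborhoods.items():
        values = {dummy[m] for m in members}
        if len(values) == 1:
            flagged.append(fid)
    return sorted(flagged)

# Hypothetical data: dummy value per feature ID, and the features
# (including the feature itself) that fall in each local neighborhood.
dummy = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
neighborhoods = {
    1: [1, 2, 3],
    2: [1, 2, 3],
    3: [2, 3, 4],
    4: [3, 4, 5],
    5: [4, 5],
}
flagged = constant_dummy_neighborhoods(neighborhoods, dummy)
print(flagged)
```

Only the features whose neighborhoods straddle the boundary between 0 and 1 values escape the flag, which is why large uniform areas are the concern.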
08182011
Hi Ed, Actually, what you want to do to get a value field for running the Getis-Ord Gi* statistic is aggregate your data so that you can use a count field as the input field for analysis. The Hot Spot Analysis tool will then calculate a z-score and p-value for each feature, indicating its statistical significance. You can learn more about Hot Spot Analysis and aggregating your data here. You may also want to run through the Hot Spot Analysis Tutorial. To get more information about all of the Spatial Statistics tools you can also go to http://esriurl.com/spatialstats .
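One way to picture the aggregation step is the Python sketch below, which bins incident points into square grid cells and counts the points in each cell. It is only an illustration of the idea, not one of the ArcGIS tools; the function name `count_points_per_cell`, the cell size, and the coordinates are made up, and it assumes projected x/y coordinates.

```python
import math
from collections import Counter

def count_points_per_cell(points, cell_size):
    """Count points per square grid cell, keyed by (column, row) index."""
    counts = Counter()
    for x, y in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return counts

# Hypothetical incident locations.
incidents = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5),
             (2.1, 2.9), (2.2, 2.8), (2.9, 2.1)]
counts = count_points_per_cell(incidents, 1.0)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

The per-cell count is the kind of value you would then use as the input field for the hot spot analysis.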
08152011
Hi Tim, That's a great question! Actually, almost all of the spatial statistics tools are Python scripts, which means that you have access to the underlying code! If you right-click on the Generate Spatial Weights Matrix tool and choose "Edit", it will open in your script editor of choice. From there you can go ahead and make any changes that you want! My suggestion would be to save a copy of the original script files before you start making changes, just to make sure you can always revert if need be. 🙂
08152011
Hi Eif, It looks like you are using the right Log function in ArcGIS. The "Log" function in VB is the same as the "Ln" function in Excel; both use the natural log. The "Log" function in Excel actually defaults to a log with a base of 10, so that is where the difference you're seeing is coming from. Sorry for the confusion there! As for the precision issue that you're having, it looks like the field that you are using to store the calculated log is an integer field (either short integer or long integer). You will need to create new fields and use either Double or Float, which will then give you the precision that you're looking for.
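If it helps to see the two logs side by side, here is a quick Python comparison (Python rather than VB or Excel, just to show the numbers). `math.log` is the natural log, matching what is described above for the VB "Log" and Excel "Ln"; `math.log10` matches Excel's default "Log". The last line uses `int()` to show the fractional part being lost; exactly how an integer field rounds or truncates is an assumption on my part.

```python
import math

value = 100

natural = math.log(value)    # natural log (base e)
base10 = math.log10(value)   # log base 10

# Assumption for illustration: an integer field keeps only the whole number.
as_integer = int(natural)

print(natural, base10, as_integer)
```

The two results differ by a constant factor (the natural log of 10), so dividing the natural log by `math.log(10)` gives the base-10 value.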
08152011
Hi Wolfram, The default threshold distance for the Getis-Ord Gi* statistic is calculated as the distance at which every feature in the dataset has at least one neighbor. You can also learn more about Hot Spot Analysis here, and you can find a ton of resources about all of the tools in the Spatial Statistics toolbox at http://esriurl.com/spatialstats .
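Put another way, that distance is the largest nearest-neighbor distance in the dataset. Here is a minimal Python sketch of that idea, assuming straight-line distances on projected coordinates and at least two points; the function name `distance_for_one_neighbor` and the sample points are hypothetical, and this is not the tool's own code.

```python
import math

def distance_for_one_neighbor(points):
    """Smallest distance at which every point has at least one neighbor."""
    largest = 0.0
    for i, (xi, yi) in enumerate(points):
        # Distance from this point to its closest other point.
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        largest = max(largest, nearest)
    return largest

# Hypothetical points: three fairly close together and one isolated feature.
pts = [(0, 0), (3, 0), (0, 4), (10, 0)]
threshold = distance_for_one_neighbor(pts)
print(threshold)
```

Notice that the single isolated point sets the threshold for the whole dataset, which is why outliers can make the default distance much larger than you might expect.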
08152011
