OLS Regression Habitat Model

Discussion created by zperdue@gmail.com on Feb 11, 2014
I'm performing some habitat modeling for a certain species using resource selection functions to determine which variables to include as model parameters. I am using OLS within ArcGIS for the regression analysis, trying to determine how the species selects for certain variables based on their availability in the surrounding environment.

I have a series of known "presence" and randomly generated "available" sites; the response is binary with "presence" as 1's and "available" as 0's. I have 546 "presence" points and 2730 "available" points.

I have run OLS to produce a number of variable combinations that minimize the AIC. However, once I go in and run Spatial Autocorrelation (Global Moran's I), the results always indicate the data are clustered. My understanding is that if the Moran's I results indicate clustering, then the variable coefficients, etc. should not be trusted, as the model will over- or under-predict in some areas. If that's the case, how can I arrive at results that are not clustered?
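
For reference, here is a minimal sketch of the global Moran's I statistic in Python with NumPy, to make concrete what the clustering check measures. It is not the ArcGIS implementation: the binary distance-band weights, the `band` parameter, the function name `morans_i`, and the synthetic grid data are all assumptions for illustration, and the sketch returns only the index, not the z-score or p-value that the ArcGIS tool also reports.

```python
import numpy as np

def morans_i(values, coords, band):
    """Global Moran's I with binary distance-band weights (illustrative only)."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()                                  # deviations from the mean
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all points
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # Neighbors are points within `band`, excluding the point itself
    w = ((d > 0) & (d <= band)).astype(float)
    s0 = w.sum()                                      # sum of all weights
    return (len(z) / s0) * (z @ w @ z) / (z @ z)

# Hypothetical data: a 20 x 20 grid of locations
xs, ys = np.meshgrid(np.arange(20), np.arange(20))
coords = np.column_stack([xs.ravel(), ys.ravel()])

rng = np.random.default_rng(0)
trend_values = coords[:, 0].astype(float)             # smooth west-east gradient
random_values = rng.normal(size=len(coords))          # no spatial structure

i_trend = morans_i(trend_values, coords, band=1.5)
i_random = morans_i(random_values, coords, band=1.5)
print(i_trend, i_random)
```

In this sketch the gradient surface gives a strongly positive index (clustered), while the random surface gives a value near zero. If `values` were model residuals, a strongly positive index would mean nearby sites share similar over- or under-predictions.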

Secondly, when performing OLS, I'm curious if there's a specific way to remove variables that are either insignificant or highly correlated. For example, if I run OLS on, say, 24 variable combinations, perhaps half are returned as significant, and the other half are indicated as insignificant. In addition, some have very high VIF, while others are low. My question is, when removing variables to optimize the variable combination, is it best to first remove variables in order of insignificance, or is it best to eliminate variables with high VIF regardless of significance? In my experience, I'm not sure that there is a "best" method, but I'm curious if others have any insight based on their experience.
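
To make the VIF part of the question concrete, here is a minimal sketch in Python with NumPy that computes VIF for each explanatory variable as 1 / (1 - R²), where R² comes from regressing that variable on all the others, and then drops the highest-VIF variable one at a time. This is one common heuristic, not a claim about the best order of removal. The function names `vif` and `drop_high_vif`, the variable names, the synthetic data, and the 7.5 cutoff are all assumptions for illustration.

```python
import numpy as np

def vif(X):
    """VIF for each column of X (rows = sites, columns = variables)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        # Regress variable j on an intercept plus all other variables
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

def drop_high_vif(X, names, threshold=7.5):
    """Repeatedly drop the variable with the largest VIF above `threshold`.

    The 7.5 default is a rule-of-thumb assumption, not a fixed standard.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return names

# Hypothetical variables: `slope` is nearly collinear with `elev`
rng = np.random.default_rng(0)
elev = rng.normal(size=500)
slope = elev + rng.normal(scale=0.1, size=500)
canopy = rng.normal(size=500)
X = np.column_stack([elev, slope, canopy])

vifs = vif(X)
kept = drop_high_vif(X, ["elev", "slope", "canopy"])
print(vifs, kept)
```

Recomputing VIF after each removal matters here: once one of the two collinear variables is dropped, the other's VIF falls back toward 1, so removing both at once would discard more than necessary.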

Thanks in advance for any help.