Hello all. I'm in serious need of assistance finding a properly specified OLS model. My goal is to produce a model that predicts probability areas for archaeological sites. I have compiled over 20 independent variables (aspect, elevation, soils, cost surfaces to water, vegetation, etc.). The highest adjusted R-squared value I have been able to achieve is 0.72; however, the Koenker, Joint Wald, Joint F, and Jarque-Bera statistics are all statistically significant. All of my VIF values are under 7.5.
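In case it helps anyone check the numbers outside the GIS tool, here is a minimal sketch of how three of the diagnostics above are defined. The function names are my own (hypothetical), it uses only the Python standard library, and it assumes the residuals and R-squared values have already been exported from a fitted model. It is not the tool's implementation, just the textbook formulas.

```python
import math

def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

def vif_from_r2(r2_j):
    """VIF for predictor j, where r2_j is the R-squared obtained by
    regressing predictor j on all the other predictors."""
    return 1.0 / (1.0 - r2_j)

def jarque_bera(residuals):
    """Jarque-Bera statistic and p-value for a list of model residuals.

    A small p-value means the residuals depart from normality.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments of the residuals (population form, divide by n).
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Under normality JB is asymptotically chi-square with 2 degrees of
    # freedom, whose upper-tail probability is exactly exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value
```

With more than 20 predictors, the adjusted R-squared penalty can be noticeable on a small sample, so it may be worth comparing it against the plain R-squared.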
I then created a random sample of non-sites to incorporate into the analysis, so my dependent variable is now "site_present" (0 for site not present; 1 for site present). After doing so, the Jarque-Bera statistic is no longer significant, but I can no longer get a good adjusted R-squared.
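To make the setup concrete, this is roughly how the site/non-site table was put together, written as a small Python sketch rather than the actual GIS workflow. The function name, the `cell_id` key, and the seed are assumptions for illustration only; each row is a dict of predictor values for one location.

```python
import random

def build_training_rows(site_rows, candidate_rows, n_nonsites, seed=0):
    """Combine known sites with a random sample of non-site locations.

    Rows are dicts keyed by a hypothetical "cell_id" plus predictor values.
    Returns new dicts with site_present = 1 for sites and 0 for non-sites.
    """
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    site_ids = {row["cell_id"] for row in site_rows}
    # Exclude any candidate location that is already a known site.
    pool = [row for row in candidate_rows if row["cell_id"] not in site_ids]
    # Raises ValueError if n_nonsites is larger than the remaining pool.
    nonsites = rng.sample(pool, n_nonsites)
    labeled = [dict(row, site_present=1) for row in site_rows]
    labeled += [dict(row, site_present=0) for row in nonsites]
    return labeled
```

The resulting `site_present` column is what goes on the left-hand side of the regression, with the 20+ variables as predictors.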
Archaeological site locations are non-stationary and do significantly cluster, but I'm all out of brain power to figure out what's wrong.
Would anyone have any advice or pointers to get this moving in the right direction?
Thank you in advance.