
Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR) woes

05-07-2010 03:31 PM
BrianWallace
New Contributor
Hello forum community,

I am a graduate student whose thesis task is to explore the potential for predicting archaeological "site" (or, more accurately, artifact) locations from an existing archaeological site database.  The goal is to find any correlations between the existing site locations (the dependent variable) and environmental independent variables.  I have looked into various logistic and multivariate regression models, but I have also been reading up on the relatively recently released OLS and GWR tools in the Spatial Statistics toolbox.  I have begun working with the data using these regression tools, but I am quickly feeling a bit overwhelmed and fear spending a significant amount of time on a goal that may not even be achievable with this method.

I have watched the free web seminar a couple of times and have read:

The ESRI Guide to GIS Analysis, Volume 2
Mitchell, Andy. ESRI Press, 2005.
Geographically Weighted Regression: the analysis of spatially varying relationships
Fotheringham, Stewart A., Chris Brunsdon, and Martin Charlton. John Wiley & Sons, 2002.

I thought I was making progress; however, I am finding very low R-squared "goodness of fit" values in every analysis attempt and feel as though I am spinning in circles.  This forum was suggested to me by regional ESRI tech support, who admitted to knowing very little about the application.  I have looked into attending an ESRI class, "Performing Analysis with ArcGIS Desktop", which was also suggested after the webinar, but before pulling the trigger on spending that kind of scratch I am hoping someone can shed some light on whether I am barking up the wrong tree.

Could OLS and GWR successfully analyze the relationship between environmental (and perhaps other explanatory systemic behavioral) variables and archaeological artifact locations?  I vaguely recall that either the webinar or Mitchell's chapter on this material mentioned the potential for success, but after my early attempts I want to make sure before continuing the pursuit.  I love the theory behind using local statistics to assess relationships with the dependent variable, and the ability to formulate a model that can later be used to help predict unknown areas, but at least with my own study I have hit the proverbial wall.

Can anyone help?

Thanks- aspiring graduate school graduate
27 Replies
MichaelMcManus
Deactivated User
I have two other questions about OLS in ArcMap.

1.  Is the Koenker (BP) statistic, which is reported as indicating biased standard errors, an application of the Breusch-Pagan test for heteroskedasticity?  I think it is, but can't find any documentation.

2.  What method is being used to produce the robust standard error estimates?

Thanks,
Mike
LaurenRosenshein
Regular Contributor
Hi Mike,

Two more great questions!  🙂  First, you are absolutely right that the Koenker (BP) Statistic is a Breusch-Pagan test for heteroscedasticity.  As far as the robust probabilities are concerned, those are calculated using White's heteroscedasticity-consistent standard errors.
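
For anyone who wants to see what these two diagnostics involve, here is a minimal sketch in plain Python for the case of one explanatory variable. It assumes the Koenker statistic is n times the R-squared from regressing the squared OLS residuals on the explanatory variable, and it uses the basic (HC0) form of White's estimator; whether the ArcGIS tool applies a small-sample correction is not stated in this thread. The function names and the sample data are made up for illustration.

```python
import math


def simple_ols(x, y):
    """Fit y = b0 + b1*x by least squares; return (b0, b1, residuals)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return b0, b1, resid


def r_squared(x, y):
    """R-squared of the simple regression of y on x."""
    _, _, resid = simple_ols(x, y)
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum(e * e for e in resid)
    return 1.0 - sse / sst


def koenker_bp(x, y):
    """Studentized Breusch-Pagan statistic: n * R^2 of the auxiliary
    regression of squared residuals on x (chi-square, 1 d.f. here)."""
    _, _, resid = simple_ols(x, y)
    squared = [e * e for e in resid]
    return len(x) * r_squared(x, squared)


def slope_standard_errors(x, y):
    """Return (classical SE, White HC0 SE) for the slope b1."""
    n = len(x)
    _, _, resid = simple_ols(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    s2 = sum(e * e for e in resid) / (n - 2)  # residual variance
    classical = math.sqrt(s2 / sxx)
    # HC0 "sandwich": each squared residual weights its own observation.
    meat = sum(((xi - x_bar) ** 2) * e * e for xi, e in zip(x, resid))
    white = math.sqrt(meat) / sxx
    return classical, white


# Hypothetical data whose error spread grows with x (heteroscedastic).
x_demo = list(range(1, 21))
y_demo = [2.0 + 0.5 * xi + 0.3 * xi * (1 if xi % 2 == 0 else -1)
          for xi in x_demo]

print("Koenker (BP):", koenker_bp(x_demo, y_demo))
print("classical SE, White SE:", slope_standard_errors(x_demo, y_demo))
```

A Koenker value above the chi-square critical value (3.84 at the 0.05 level with one explanatory variable) suggests the classical standard errors should not be trusted, which is when the robust ones are the ones to read.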

Hope this helps!

Lauren Rosenshein
Geoprocessing Product Engineer
MichaelMcManus
Deactivated User
Hi Lauren,

Is this the citation for White's heteroscedasticity-consistent standard errors that is being used in producing the robust standard errors for the OLS regression:  White, H. (1980), "A Heteroscedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroscedasticity," Econometrica, 48, 817-838?

I am working on a manuscript for a journal article so I need to know how to cite the output produced by OLS regression in ArcMap.
Thanks,
Mike
LaurenRosenshein
Regular Contributor
Yup, that's the one!

Lauren
MuhammadBilal1
Emerging Contributor
I found a non-linear relationship between the variables. I want to transform them using log and exponential transformations, but I did not find these transformation tools in ArcGIS. Would you please guide me to these transformations?

Regards,
Bilal
MichaelMcManus
Deactivated User
Hi Lauren,

Just checking: are the "StdResid" values returned with the OLS output standardized residuals or Studentized residuals?  I opened up the code, which is nice, and saw:

self.stdRedisuals = e / self.seResiduals.

and seResiduals=sqrt(s2)

That looks like a standardized residual, as the denominator does not include the factor sqrt(1 - h_ii).  Both Weisberg (1985) and Ramsey and Schafer (2002) give the equation for a studentized residual as:
studres_i = res_i / (sigma_hat * sqrt(1 - h_ii))
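
To make the difference concrete, here is a small sketch in plain Python for a one-variable regression. It assumes sigma_hat is the square root of the residual sum of squares divided by n - 2, and it uses the closed-form leverage for simple regression, h_ii = 1/n + (x_i - x_bar)^2 / Sxx. The function name and the data are hypothetical, not taken from the ArcGIS code.

```python
import math


def residual_diagnostics(x, y):
    """Return (standardized, studentized, leverage) lists for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

    # Assumed estimate of the residual standard deviation (n - 2 d.f.).
    sigma_hat = math.sqrt(sum(e * e for e in resid) / (n - 2))

    # Leverage h_ii: diagonal of the hat matrix for simple regression.
    leverage = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    # Standardized: same denominator for every observation.
    standardized = [e / sigma_hat for e in resid]
    # Studentized: denominator shrinks for high-leverage observations.
    studentized = [e / (sigma_hat * math.sqrt(1.0 - h))
                   for e, h in zip(resid, leverage)]
    return standardized, studentized, leverage


# Hypothetical data; the last point sits far from the others in x.
x_demo = [1, 2, 3, 4, 5, 15]
y_demo = [1.1, 1.9, 3.2, 3.9, 5.1, 12.0]

std, stud, lev = residual_diagnostics(x_demo, y_demo)
for xi, a, b, h in zip(x_demo, std, stud, lev):
    print(xi, round(a, 3), round(b, 3), round(h, 3))
```

Because 1 - h_ii is never greater than 1, each studentized residual is at least as large in magnitude as its standardized counterpart, and the gap is widest for high-leverage points.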

Thanks,
Mike
LaurenScott
Deactivated User
Hi Mike,
Yes, it's Standardized Residual.
Thanks for pointing out that this isn't documented!  We will get this corrected.
Best wishes,
Lauren Scott
Esri
Geoprocessing, Spatial Statistics
LaurenScott
Deactivated User
Hi Bilal,
You can add a new field to your table.  Right click on that field and select Field Calculator.  You can use the calculator to populate the new field with the log of the original values or the original values raised to whatever exponent you think works best.  The Add Field and Calculate Field tools can also be used. 
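
As a rough sketch of what the calculation could look like with the Python parser in the Field Calculator, the two helper functions below could go in the code block area, with an expression such as `safe_log(!MY_FIELD!)` or `raise_to(!MY_FIELD!, 0.5)`. The function names and the field name MY_FIELD are hypothetical. Returning None for values where the log is undefined is just one possible choice, and how a None result is stored depends on the field and data source.

```python
import math


def safe_log(value, offset=0.0):
    """Natural log of (value + offset), or None when it is undefined."""
    if value is None:
        return None
    shifted = value + offset
    if shifted <= 0:
        # log is undefined for zero and negative numbers
        return None
    return math.log(shifted)


def raise_to(value, exponent):
    """Original value raised to the chosen exponent, or None for nulls."""
    if value is None:
        return None
    return float(value) ** exponent


# Quick check outside ArcGIS with made-up values.
for v in [0, 1, 10, 100]:
    print(v, safe_log(v), safe_log(v, offset=1.0), raise_to(v, 0.5))
```

The optional offset is there for variables that contain zeros, where adding a small constant before taking the log is a common workaround.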
I hope this helps,
Lauren Scott
Esri
BethanyPratt
New Contributor
Hello,
I have a question about how ArcMap reports the GWR R-squared results -

The results dialogue, as well as the _supp file created, report an R2 and adjusted R2 value.  Where is this coming from?  It is not the average of the local R2 values in the output shapefile; in fact, it's quite a bit higher.

Bethany
KarenKemp
Emerging Contributor
Following up on the earlier discussion about dummy (categorical) variables, a student of mine has a set of binary variables that are not mutually exclusive - i.e. they represent a series of yes/no answers on a set of questions, rather than a means of dividing the data into sets as with categorical variables. Can these be used in the linear OLS and GWR analyses along with a few other normal numeric variables? I don't suppose there's a way to do a logit version?