Using AIC to compare Ordinary Least Squares and Geog. Weighted Regression models

07-19-2010 04:52 AM
SteveDugdale
New Contributor
I have produced some Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR) models and I want to identify which model is better.

According to http://resources.esri.com/help/9.3/arcgisengine/java/Gp_ToolRef/spatial_statistics_tools/interpretin... "Comparing the GWR AICc value to the OLS AICc value is one way to assess the benefits of moving from a global model (OLS) to a local regression model (GWR)".

However, whereas the OLS tool in ArcMap 9.3 outputs plain AIC (NOT AICc), the GWR tool appears to output AICc (i.e. a "corrected" version of AIC).  According to Fotheringham, A.S., Brunsdon, C. & Charlton, M. (2002), Geographically Weighted Regression: the analysis of spatially varying relationships, John Wiley & Sons, Chichester (page 96), "direct comparisons should NOT be made between AIC and AICc".

I'm therefore wondering how I can validly compare my OLS and GWR models when the diagnostics appear to give me incompatible outputs for that purpose.  I'm rather confused.

Please could someone advise whether the outputs are comparable after all and if not, why have they been "programmed" in this way?  I'm a non-statistician so I don't understand the nuts and bolts of the underlying calculations. 

Any assistance would be appreciated.

Thanks
Steve

Accepted Solutions
LaurenScott
Occasional Contributor
Hi Steve,
A bit more:
With regard to using AICc to compare non-nested models, please see p 88 of the first citation (Burnham and Anderson, 2002) in my previous post.  It states: "A substantial advantage in using information-theoretic criteria is that they are valid for nonnested models..."  They go on to state: "Of course, traditional likelihood ratio tests are defined only for nested models, and this represents another substantial limitation in the use of hypothesis testing in model selection."  It's important not to confuse AICc for model selection with traditional likelihood ratio tests for hypothesis testing.

In any case, even if you are feeling hesitant about comparing AICc for non-nested models, please note that if you follow our guidelines of finding a properly specified OLS model first, then moving to GWR with those same explanatory variables (minus any spatial regime variables), the GWR/OLS models essentially *are* nested: the standard OLS model is a subset of all GWR models... just one in which all local coefficients take the same value.  So please feel comfortable comparing OLS AICc to GWR AICc.  Based on the citation above (p 88), I also feel comfortable comparing AICc values for non-nested models (as long as the dependent variable data are identical in all models being compared).
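
The nesting argument above can be illustrated with a small sketch. It assumes a single explanatory variable, a Gaussian distance kernel, and a local fit at one location only; it is not the ArcGIS GWR implementation, and the function names and simulated data are hypothetical. With a very wide bandwidth every feature gets nearly the same weight, so the local coefficients collapse to the global OLS coefficients.

```python
import math
import random


def weighted_line_fit(x, y, w):
    """Weighted least-squares (intercept, slope) for y = a + b*x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def gaussian_weights(coords, i, bandwidth):
    """Gaussian kernel weights of every feature relative to feature i (assumed kernel)."""
    cx, cy = coords[i]
    return [
        math.exp(-0.5 * (math.hypot(px - cx, py - cy) / bandwidth) ** 2)
        for px, py in coords
    ]


# Hypothetical data: the slope drifts from west to east across a 10 x 10 study area.
random.seed(0)
n = 200
coords = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(n)]
x = [random.gauss(0, 1) for _ in range(n)]
y = [1 + (1 + 0.3 * cx) * xi + random.gauss(0, 0.5)
     for (cx, _), xi in zip(coords, x)]

ols = weighted_line_fit(x, y, [1.0] * n)                           # global model: equal weights
wide = weighted_line_fit(x, y, gaussian_weights(coords, 0, 1e6))   # huge bandwidth
narrow = weighted_line_fit(x, y, gaussian_weights(coords, 0, 1.5)) # local fit at feature 0

print("OLS            :", ols)
print("wide bandwidth :", wide)
print("narrow bandwidth:", narrow)
```

The wide-bandwidth fit should match the OLS fit to many decimal places, while the narrow-bandwidth fit is free to differ because it mostly sees features near location 0.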

I hope this helps,
Best wishes,
Lauren

JeffreyEvans
Occasional Contributor III
I ardently disagree with the assertion that "Comparing the GWR AICc value to the OLS AICc value is one way to assess the benefits of moving from a global model (OLS) to a local regression model (GWR)". AIC assumes nested model variances in order to compare competing models. Because of non-nested variances, there is a controversy in the literature regarding appropriate use of AIC in spatial models. Regardless, YOU CANNOT COMPARE A GWR AND OLS MODEL USING AIC.  AIC is intended for hypothesis testing of competing models and not data mining. Competing models would not be OLS vs. GWR, but rather models built on different subsets of variables. Given the mathematical differences between GWR and OLS, there is no unification between the models, making AIC scores non-comparable.

The asymptotic justification of AIC requires two strong assumptions:
(1) that the true model is contained in the candidate class under consideration,
(2) that the vector of maximum likelihood estimators satisfies the conventional large-sample properties of maximum likelihood.

Because of the above assumptions, AICc was developed as a modification that accounts for small sample sizes. The reason GWR uses AICc is that it is a local regression that iteratively fits small numbers of observations within a specified bandwidth. Unless you have small sample issues, AICc in an OLS is not appropriate. Some exploratory data analysis (box-and-whisker and X-Y plots, Moran's I for global autocorrelation, LISA for non-stationarity, etc.) should provide insight into the appropriate modeling approach and verify that your data are suitable for a spatial model like GWR. If there is no autocorrelation (1st or 2nd order) in your data then OLS is quite well specified.
LaurenScott
Occasional Contributor
Hi Steve,
Thanks so much for posting this question!  Yes, you can definitely use the Akaike Information Criterion to compare different OLS and GWR models as long as all of the models are based on the same dependent variable.  For references please see:

(1) Burnham, K.P. and D.R. Anderson.  2002.  Model Selection and Multimodel Inference: a practical information-theoretic approach, 2nd Edition.  New York: Springer. 

(2) Fotheringham, A. S., C. Brunsdon, and M. Charlton.  2002.  Geographically Weighted Regression: the analysis of spatially varying relationships.  Chichester, UK: Wiley.  In particular section 4.5 regarding model selection.

(3) Hurvich, C., J. Simonoff, and C. Tsai.  1998.  "Smoothing Parameter Selection in Nonparametric Regression Using an Improved Akaike Information Criterion."  Journal of the Royal Statistical Society. Series B vol. 60 (2) pp. 271-293.

AICc is a variant of AIC that corrects bias associated with small sample sizes.  AICc approaches AIC as the number of features gets larger.  Since we only recommend that you use GWR when you have lots of features (I believe our documentation recommends several hundred features and it specifically says GWR is not appropriate for small datasets), this probably isn't a problem for you.  Nonetheless, having a different AIC method for OLS than for GWR is clearly a source of confusion!  Consequently (and as a direct consequence of YOUR post to this forum, btw), with ArcGIS 10 Service Pack 1, OLS will compute both the AIC and the AICc.  AICc will be written to the OLS summary report.  Both AIC and AICc will be written to the optional diagnostic table so you can compare them to see how close they really are.  With GWR we will continue to compute AICc.  Because GWR creates an equation for each feature, the number of features associated with each local equation will be smaller than for the whole dataset.  So it is more important to avoid comparing AIC to AICc with GWR.  (ArcGIS 10 Service Pack 1 should be available for download end of October or first part of November, by the way).
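
To see how quickly AICc approaches AIC, here is a minimal sketch. It assumes a least-squares model with Gaussian errors and the common small-sample correction AICc = AIC + 2k(k + 1) / (n - k - 1); it is not necessarily the exact computation the ArcGIS tools perform, and the function names and example numbers are hypothetical. For GWR the parameter count is replaced by an effective number of parameters, which this sketch does not cover.

```python
import math


def aic_gaussian(rss, n, k):
    """AIC for a least-squares fit, assuming Gaussian errors.

    rss: residual sum of squares
    n:   number of observations (features)
    k:   number of estimated parameters (coefficients, intercept, error variance)
    """
    log_likelihood = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    return 2 * k - 2 * log_likelihood


def aicc_gaussian(rss, n, k):
    """AICc = AIC + 2k(k + 1) / (n - k - 1); requires n > k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return aic_gaussian(rss, n, k) + 2 * k * (k + 1) / (n - k - 1)


# Hypothetical example: 5 estimated parameters, residual variance held at 1.0.
k = 5
for n in (20, 100, 1000):
    rss = 1.0 * n
    gap = aicc_gaussian(rss, n, k) - aic_gaussian(rss, n, k)
    print(f"n={n:5d}  AICc - AIC = {gap:.3f}")
```

The gap depends only on n and k, so with several hundred features and a handful of explanatory variables it becomes a small fraction of a unit.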

Please keep in mind that the Akaike Information Criterion is a relative measure.  You could come up with a model for some Y variable using various ridiculous candidate explanatory variables, and without doubt, one of those bogus models will have the lowest AICc value.  That doesn't mean that any of your models are good 🙂  This is why we strongly recommend that you always start with OLS and make sure you have a properly specified OLS model before moving to GWR.  Once you identify your key explanatory variables in OLS, use those same explanatory variables (minus any spatial regime variables) in GWR.  A drop in the AICc value of more than 3 units indicates that your model has improved by allowing coefficients to vary across the study area.
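
The 3-unit rule of thumb above can be written as a small check. This is only a sketch: the function name and the AICc values are hypothetical, and both values are assumed to come from models fitted to the same dependent variable and the same features.

```python
def aicc_drop(ols_aicc, gwr_aicc, threshold=3.0):
    """Return (drop, improved).

    drop:     how far AICc fell when moving from OLS to GWR
    improved: True when the drop exceeds the threshold (3 units by default)
    """
    drop = ols_aicc - gwr_aicc
    return drop, drop > threshold


# Hypothetical AICc values read from the OLS and GWR reports.
drop, improved = aicc_drop(ols_aicc=1520.4, gwr_aicc=1498.7)
print(f"AICc dropped by {drop:.1f} units; GWR improves on OLS: {improved}")
```

Because AICc is a relative measure, a large drop says only that GWR fits these data better than this OLS model, not that either model is well specified.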

Again, thanks for posting your question! 

We finally have our own forum thread, by the way, so if you have other questions relating to the tools in the spatial statistics toolbox, please post them to:  http://forums.arcgis.com/forums/110-Spatial-Statistics 

I hope this information is helpful!
Best wishes,
Lauren Scott, PhD
ESRI
Geoprocessing, Spatial Statistics