Can't seem to find this anywhere. How is the R2 in the output report generated? It's not the mean/median or other statistic describing the range of local R2s that GWR produces. How can you report a single R2 for GWR? Thanks for any help with this, and sorry if it's somewhere obvious, just can't seem to find it!
This is a really good question! The answer is actually pretty simple. The overall R2 value for GWR is calculated exactly the same way as it is in OLS: essentially, it's the proportion of the variability in the dependent variable that the model explains. For both it is 1 - (Σ(observed - predicted)² / Σ(observed - mean of all observed)²). In other words, it's 1 minus the residual sum of squares divided by the total sum of squares of the observed values. In both OLS and GWR, the residuals are the observed minus the predicted values (the sign doesn't matter once they are squared), so you would expect OLS and GWR to have different R-squared values, because those predicted values are calculated using different methods. So it's the same as OLS; the only difference is which values are used for the predictions.
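To make the formula concrete, here is a tiny worked example in Python. The numbers are made up purely for illustration, and the function name `r_squared` is my own, not anything from the tool:

```python
import numpy as np

def r_squared(observed, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)        # Σ(observed - predicted)²
    ss_tot = np.sum((observed - observed.mean()) ** 2)  # Σ(observed - mean)²
    return 1.0 - ss_res / ss_tot

# Illustrative values only: the mean of observed is 5, so SS_tot = 9 + 1 + 1 + 9 = 20.
observed = [2.0, 4.0, 6.0, 8.0]
# Residuals are -1, 0, 0, 1, so SS_res = 2.
predicted = [3.0, 4.0, 6.0, 7.0]

r2 = r_squared(observed, predicted)
print(r2)  # 1 - 2/20, i.e. about 0.9
```

Nothing in that function knows or cares where the predicted values came from, which is the whole point.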
This is probably way more than you wanted to know, but hopefully it helps! Let me know if you need any more details. 🙂
I'm not sure I understand what you mean by "pooling the residuals" from the local models, but I'll try to clarify.
Basically, both OLS and GWR ultimately calculate a prediction for each feature in the dataset. For OLS that prediction is based on a global model. For GWR that prediction is based on a local model that was calibrated using nearby features. Either way, each feature ends up with a predicted value. The residuals, whether you are talking about OLS or GWR, are just the difference between that predicted value and the observed value.
Once you have those residuals, calculating the R2 value is exactly the same for both methods. The only difference is whether the predicted value was calculated using a global model (OLS) or a local model (GWR).
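If it helps to see it end to end, here is a rough sketch of the idea in Python. To be clear, this is not the tool's actual code: the Gaussian kernel, the fixed bandwidth of 1.5, the synthetic data, and all function names are assumptions I picked for illustration. It just shows one prediction per feature from each method being fed into the same R2 calculation.

```python
import numpy as np

def r_squared(observed, predicted):
    """Same formula for both models: 1 - SS_res / SS_tot."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def ols_predict(x, y):
    """One global model: a single set of coefficients for every feature."""
    A = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ beta

def gwr_predict(coords, x, y, bandwidth):
    """One local model per feature, weighted toward nearby features.

    Assumption: Gaussian kernel with a fixed bandwidth (a simplification).
    """
    A = np.column_stack([np.ones(len(x)), x])
    preds = np.empty(len(y))
    for i in range(len(y)):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)
        sw = np.sqrt(w)  # weighted least squares via sqrt-weight scaling
        beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        preds[i] = A[i] @ beta  # prediction for feature i from its own local model
    return preds

# Synthetic data (made up): the slope drifts from west to east,
# which a single global model cannot capture.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
x = rng.normal(size=n)
slope = 1.0 + 0.5 * coords[:, 0]
y = 2.0 + slope * x + rng.normal(scale=0.5, size=n)

r2_ols = r_squared(y, ols_predict(x, y))
r2_gwr = r_squared(y, gwr_predict(coords, x, y, bandwidth=1.5))
print(f"OLS R2: {r2_ols:.3f}, GWR R2: {r2_gwr:.3f}")
```

Both R2 values come out of the identical `r_squared` function. The GWR one is higher on this data only because the local models produce predictions that sit closer to the observed values.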