
Geographically and Temporally Weighted Regression (GTWR): p-values of local regression coefficients

Sunday
ZhichengZhong
New Contributor

Hi! I have recently been puzzled by the p-values of the local regression coefficients in the GTWR model. They are rarely mentioned in the related research: authors typically report only the R-squared, adjusted R-squared, and AICc of the OLS, GWR, TWR, and GTWR models to show how well each model fits. When the temporal variation of the local regression coefficients is then discussed, the p-values are not mentioned, which makes it hard to judge the explanatory power of those coefficients. In addition, does a GTWR model require the p-value of every explanatory variable's regression coefficient to be less than 0.05 at every observation? If the p-value of an explanatory variable is less than 0.05 at only a few observations, but the model as a whole fits very well and fits worse once that variable is removed, should the variable be removed or not? Is it reasonable, as some studies do, to treat the model as having strong explanatory power when more than 50% of observations have a p-value below 0.05, and to discuss the local regression coefficients on that basis? I hope someone can help with these questions. Thanks a lot. 😊

4 Replies
EdwardGause
New Contributor III
Since time is just an added dimension, it does not change the fact that your explanatory variables need to be statistically significant. Remember that OLS is a multivariable linear regression, which means it is multi-dimensional in the sense that each explanatory variable is on its own scale dimensionally; those dimensions don't have to be space and time, but they could be. So, yes, your p-values should still be checked for the explanatory variables. For your explanatory variable that is not statistically significant, maybe you can find another variable that is highly collinear with it, since such variables tell the same story mathematically. Obviously, it would need to be a different variable that makes sense, but the list of variables that are highly collinear with your non-significant variable may produce one that is significant and makes sense in the model.
Another thought: verify that there is a linear relationship between your insignificant variable and the dependent variable, and not some quadratic or other curved relationship. Maybe you need to apply a transformation to make the relationship linear before you put the variable into your linear equation. Esri has the built-in Data Engineering tool in ArcGIS Pro that can do transformations on variables, and you can create a new column when you do that.
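As a rough illustration of the linearity check described above, the sketch below compares the Pearson correlation between a dependent variable and an explanatory variable before and after a log transformation. The data are synthetic and the helper name `pearson_r` is hypothetical; this is plain Python, not the ArcGIS Pro workflow, and a log transform is only one of several transformations that might apply.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


random.seed(0)

# Synthetic (assumed) data: y depends on log(x), so the raw x-y relationship is curved.
x = [random.uniform(1, 1000) for _ in range(500)]
y = [3.0 * math.log(v) + random.gauss(0, 0.5) for v in x]

r_raw = pearson_r(x, y)                          # correlation with the untransformed variable
r_log = pearson_r([math.log(v) for v in x], y)   # correlation after the log transform

print(f"r with raw x:  {r_raw:.3f}")
print(f"r with log(x): {r_log:.3f}")
```

If the transformed variable correlates noticeably better with the dependent variable, that suggests the transformed column may be the better candidate for the linear model; a scatter plot remains the more reliable check.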
Disclaimer: I have never used their GTWR tool, but I have used OLS, GWR, and Exploratory Regression and built my own Multivariable Linear Regression using Matrix Math and a programming language.

Robert “Edward” Gause, GISP | Director of Information Services | HTC | p 843-369-8483 | www.htcinc.net | This is life. Connect with it.
ZhichengZhong
New Contributor

Thank you very much for your reply. I actually built the OLS regression and verified the linear relationships before running the GTWR regression, and I transformed the selected explanatory variables to ensure a better linear relationship. However, in the GTWR model only the explanatory variables with high Beta values have local regression coefficients with good significance across observations; the explanatory variables with low Beta values have local coefficients with very poor significance across observations. The adjusted R-squared and the AICc of the GTWR model are nevertheless very good, so are these results usable for discussion? My aim is to compare the effects of the same set of explanatory variables on two dependent variables. The model for the first dependent variable achieved the desired effect, but the model for the second produces the problem described above. This set already consists of explanatory variables that are linearly related to, and significant for, both dependent variables.

EricKrause
Esri Regular Contributor

Hi @ZhichengZhong,

ArcGIS does not have a GTWR tool, but in principle, including time should not alter the question of whether and how to test for significant explanatory variables. While the GWR tool does provide significance results for local models, it's understandable that other software packages and publications do not. The problem is that GW(T)R isn't really a single model: it is a collection of local models, each estimated at the location of an input feature. Further, these models are correlated with each other, as they often share the same features in their neighborhoods. This can create problems related to multiple hypothesis testing: you are effectively running a number of tests equal to N times the number of explanatory variables, where N is the number of input features, and you should definitely be cautious in interpreting any particular p-value when performing so many hypothesis tests. This is why, generally, explanatory variables are chosen using global models like OLS, and statistics like R-squared and AIC are used to determine how much better GW(T)R does compared to the global model.
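To make the multiple-testing concern concrete, here is a minimal sketch in plain Python. It assumes a hypothetical dataset of 1000 features and 5 explanatory variables, draws p-values for the case where no variable has any real effect, and counts how many still fall below 0.05. For comparison it also applies the Benjamini-Hochberg procedure, which is one common correction for multiple tests and is not something the GWR tool or this thread prescribes. The simulation treats the tests as independent, whereas local GW(T)R models are correlated, so the numbers are only indicative.

```python
import random


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the Benjamini-Hochberg procedure rejects."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that the k-th smallest p-value <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


random.seed(42)

n_features = 1000    # assumed number of local models (N)
n_variables = 5      # assumed number of explanatory variables
n_tests = n_features * n_variables

# With no real effect anywhere, p-values are uniform on [0, 1].
null_p = [random.random() for _ in range(n_tests)]

raw_hits = sum(p < 0.05 for p in null_p)        # "significant" by chance alone
bh_hits = sum(benjamini_hochberg(null_p))       # survivors after the correction
expected_by_chance = 0.05 * n_tests

print(f"tests run: {n_tests}")
print(f"expected p < 0.05 by chance: {expected_by_chance:.0f}")
print(f"observed p < 0.05 with no real effect: {raw_hits}")
print(f"still significant after Benjamini-Hochberg: {bh_hits}")
```

The point of the comparison is that a few percent of local coefficients can pass an uncorrected 0.05 threshold even when nothing is going on, so a raw share of "significant" observations is hard to interpret on its own.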

ZhichengZhong
New Contributor

Many thanks for your reply! I'm confused because these publications visualize all the local regression coefficients spatially and even discuss time trends, but shouldn't that discussion be predicated on all the local regression coefficients passing the significance test? Their practice of not showing p-values seems non-transparent. What if some observations are not significant? Would it make sense to set a significance ratio similar to the one I mentioned?
