Accessing Regression Results with ArcPy

NiallDelany_SEAI · 06-18-2024

I'm trying to script several Generalized Linear Regression (GLR) runs back to back with ArcPy and assess the p-values of each run. Calling getMessages() on the result returns the full message printout (see below), which contains the significance values, but I can't find a way to parse that text and extract a specific p-value.

Can anyone advise how to script the extraction of the Robust_Pr value for the SIM2_NORM variable?

'Start Time: Tuesday 18 June 2024 08:13:44\njson:\n[{"element": "table", "data": [["Variable", ["Coefficient", {"element": "sup", "data": "a"}], "StdError", "t-Statistic", ["Probability", {"element": "sup", "data": "b"}], "Robust_SE", "Robust_t", ["Robust_Pr", {"element": "sup", "data": "b"}]], ["Intercept", "0.595155", "0.023640", "25.175822", ["0.000000", {"element": "sup", "data": "*"}], "0.022500", "26.451563", ["0.000000", {"element": "sup", "data": "*"}]], ["SIM2_NORM", "-0.102146", "0.047075", "-2.169875", ["0.030479", {"element": "sup", "data": "*"}], "0.049163", "-2.077722", ["0.038240", {"element": "sup", "data": "*"}]]], "elementProps": {"striped": "true", "title": "Summary of GLR Results [Model Type: Continuous (Gaussian/OLS)]", "0": {"align": "right", "pad": "0px", "wrap": true}, "1": {"align": "right", "pad": "0px", "wrap": true}, "2": {"align": "right", "pad": "0px", "wrap": true}, "3": {"align": "right", "pad": "0px", "wrap": true}, "4": {"align": "right", "pad": "0px", "wrap": true}, "5": {"align": "right", "pad": "0px", "wrap": true}, "6": {"align": "right", "pad": "0px", "wrap": true}, "7": {"align": "right", "pad": "0px", "wrap": true}}}]\njson:\n[{"element": "table", "data": [["Input Features", "RANDOM_SAMPLE_POINTS_Layer7", "  Dependent Variable", "SIM1_NORM "], ["Number of Observations", "496", ["  Akaike\'s Information Criterion (AICc)", {"element": "sup", "data": "d"}], "-402.173053 "], [["Multiple R-Squared", {"element": "sup", "data": "d"}], "0.009441", ["  Adjusted R-Squared", {"element": "sup", "data": "d"}], "0.007436 "], [["Joint F-Statistic", {"element": "sup", "data": "e"}], "4.708358", "  Prob(>F), (1,494) degrees of freedom", "0.096036 "], [["Joint Wald Statistic", {"element": "sup", "data": "e"}], "4.316927", "  Prob(>chi-squared), (1) degrees of freedom", ["0.037735", {"element": "sup", "data": "*"}]], [["Koenker (BP) Statistic", {"element": "sup", "data": "f"}], "18.432843", "  Prob(>chi-squared), (1) degrees of freedom", 
["0.000018", {"element": "sup", "data": "*"}]], [["Jarque-Bera Statistic", {"element": "sup", "data": "g"}], "5.001944", "  Prob(>chi-squared), (2) degrees of freedom", "0.082005 "]], "elementProps": {"striped": "true", "noHeader": true, "title": "GLR Diagnostics", "0": {"align": "left", "pad": "0px", "wrap": true}, "1": {"align": "right", "pad": "0px", "wrap": true}, "2": {"align": "left", "pad": "0px", "wrap": true}, "3": {"align": "right", "pad": "0px", "wrap": true}}}]\njson:\n[{"element": "table", "data": [["*", "An asterisk next to a number indicates a statistically significant p-value (p < 0.01)."], ["a", "Coefficient: Represents the strength and type of relationship between each explanatory variable and the dependent variable."], ["b", "Probability and Robust Probability (Robust_Pr): Asterisk (*) indicates a coefficient is statistically significant (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Robust Probability column (Robust_Pr) to determine coefficient significance."], ["c", "Variance Inflation Factor (VIF): Large Variance Inflation Factor (VIF) values (> 7.5) indicate redundancy among explanatory variables."], ["d", "R-Squared and Akaike\'s Information Criterion (AICc): Measures of model fit/performance."], ["e", "Joint F and Wald Statistics: Asterisk (*) indicates overall model significance (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Wald Statistic to determine overall model significance."], ["f", "Koenker (BP) Statistic: When this test is statistically significant (p < 0.01), the relationships modeled are not consistent (either due to non-stationarity or heteroskedasticity).  
You should rely on the Robust Probabilities (Robust_Pr) to determine coefficient significance and on the Wald Statistic to determine overall model significance."], ["g", "Jarque-Bera Statistic: When this test is statistically significant (p < 0.01) model predictions are biased (the residuals are not normally distributed)."]], "elementProps": {"striped": "true", "noHeader": true, "title": "Notes on Interpretation", "0": {"align": "center", "pad": "0px", "wrap": true}, "1": {"align": "left", "pad": "0px", "wrap": true}}}]\nSucceeded at Tuesday 18 June 2024 08:13:49 (Elapsed Time: 5.30 seconds)'
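One possible approach, sketched under the assumption that the message text always has the layout shown above: each table follows a `json:` marker and is a JSON list of elements, with footnote markers wrapped as `[text, {"element": "sup", ...}]`. The helper names (`cell_text`, `extract_tables`, `coefficient_table`) are hypothetical, `SAMPLE` is a trimmed copy of the string above (only the first table, with most of `elementProps` dropped), and the layout may differ between ArcGIS Pro versions, so treat this as a starting point rather than a documented interface.

```python
import json
import re

# Trimmed copy of the getMessages() string from the post (first table only).
SAMPLE = (
    'Start Time: Tuesday 18 June 2024 08:13:44\njson:\n'
    '[{"element": "table", "data": ['
    '["Variable", ["Coefficient", {"element": "sup", "data": "a"}], "StdError", "t-Statistic", '
    '["Probability", {"element": "sup", "data": "b"}], "Robust_SE", "Robust_t", '
    '["Robust_Pr", {"element": "sup", "data": "b"}]], '
    '["Intercept", "0.595155", "0.023640", "25.175822", '
    '["0.000000", {"element": "sup", "data": "*"}], "0.022500", "26.451563", '
    '["0.000000", {"element": "sup", "data": "*"}]], '
    '["SIM2_NORM", "-0.102146", "0.047075", "-2.169875", '
    '["0.030479", {"element": "sup", "data": "*"}], "0.049163", "-2.077722", '
    '["0.038240", {"element": "sup", "data": "*"}]]], '
    '"elementProps": {"striped": "true", '
    '"title": "Summary of GLR Results [Model Type: Continuous (Gaussian/OLS)]"}}]'
    '\nSucceeded at Tuesday 18 June 2024 08:13:49 (Elapsed Time: 5.30 seconds)'
)

_JSON_MARKER = re.compile(r"json:\s*")


def cell_text(cell):
    """Drop the footnote wrapper: ["0.038240", {"element": "sup", ...}] -> "0.038240"."""
    if isinstance(cell, list):
        cell = cell[0]
    return str(cell).strip()


def extract_tables(messages):
    """Return {table title: rows of plain strings} for every table after a 'json:' marker."""
    decoder = json.JSONDecoder()
    tables = {}
    pos = 0
    while True:
        match = _JSON_MARKER.search(messages, pos)
        if match is None:
            break
        # raw_decode parses one JSON document and reports where it ended,
        # so the trailing "Succeeded at ..." text is ignored.
        elements, pos = decoder.raw_decode(messages, match.end())
        for element in elements:
            if element.get("element") != "table":
                continue
            title = element.get("elementProps", {}).get("title", "")
            tables[title] = [[cell_text(c) for c in row] for row in element["data"]]
    return tables


def coefficient_table(messages, title_prefix="Summary of GLR Results"):
    """Return {variable: {column name: value string}} for the coefficient summary table."""
    for title, rows in extract_tables(messages).items():
        if title.startswith(title_prefix):
            header = rows[0]
            return {row[0]: dict(zip(header[1:], row[1:])) for row in rows[1:]}
    raise ValueError("no table whose title starts with %r" % title_prefix)


# In a real script the string would come from the tool result, e.g.
#   result = arcpy.stats.GeneralizedLinearRegression(...)
#   messages = result.getMessages()
p_value = float(coefficient_table(SAMPLE)["SIM2_NORM"]["Robust_Pr"])
print(p_value)  # 0.03824
```

Decoding the JSON rather than pattern-matching the rendered text keeps the column names attached to the values, so the same lookup works for other variables or for the Probability column.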

Messages

Start Time: Tuesday 18 June 2024 08:13:44

Summary of GLR Results [Model Type: Continuous (Gaussian/OLS)]

Variable	Coefficient [a]	StdError	t-Statistic	Probability [b]	Robust_SE	Robust_t	Robust_Pr [b]
Intercept	0.595155	0.023640	25.175822	0.000000*	0.022500	26.451563	0.000000*
SIM2_NORM	-0.102146	0.047075	-2.169875	0.030479*	0.049163	-2.077722	0.038240*

GLR Diagnostics

Input Features	RANDOM_SAMPLE_POINTS_Layer7	Dependent Variable	SIM1_NORM
Number of Observations	496	Akaike's Information Criterion (AICc) [d]	-402.173053
Multiple R-Squared [d]	0.009441	Adjusted R-Squared [d]	0.007436
Joint F-Statistic [e]	4.708358	Prob(>F), (1,494) degrees of freedom	0.096036
Joint Wald Statistic [e]	4.316927	Prob(>chi-squared), (1) degrees of freedom	0.037735*
Koenker (BP) Statistic [f]	18.432843	Prob(>chi-squared), (1) degrees of freedom	0.000018*
Jarque-Bera Statistic [g]	5.001944	Prob(>chi-squared), (2) degrees of freedom	0.082005

Notes on Interpretation

*	An asterisk next to a number indicates a statistically significant p-value (p < 0.01).
a	Coefficient: Represents the strength and type of relationship between each explanatory variable and the dependent variable.
b	Probability and Robust Probability (Robust_Pr): Asterisk (*) indicates a coefficient is statistically significant (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Robust Probability column (Robust_Pr) to determine coefficient significance.
c	Variance Inflation Factor (VIF): Large Variance Inflation Factor (VIF) values (> 7.5) indicate redundancy among explanatory variables.
d	R-Squared and Akaike's Information Criterion (AICc): Measures of model fit/performance.
e	Joint F and Wald Statistics: Asterisk (*) indicates overall model significance (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Wald Statistic to determine overall model significance.
f	Koenker (BP) Statistic: When this test is statistically significant (p < 0.01), the relationships modeled are not consistent (either due to non-stationarity or heteroskedasticity). You should rely on the Robust Probabilities (Robust_Pr) to determine coefficient significance and on the Wald Statistic to determine overall model significance.
g	Jarque-Bera Statistic: When this test is statistically significant (p < 0.01) model predictions are biased (the residuals are not normally distributed).
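When assessing many runs in a loop, the rule in notes b and f (use Robust_Pr when the Koenker (BP) test is significant, otherwise Probability) could be encoded roughly as below. This is only a sketch: `pick_coefficient_p_value` is a hypothetical helper, the numbers are copied from the SIM2_NORM row and the Koenker (BP) row above, and `alpha` is passed explicitly because the right threshold is a judgment call for the analysis.

```python
def pick_coefficient_p_value(probability, robust_probability, koenker_p, alpha):
    """Return (column name, p-value) following interpretation notes b and f."""
    if koenker_p < alpha:
        # Koenker (BP) is significant: rely on the robust probability.
        return "Robust_Pr", robust_probability
    return "Probability", probability


# Values copied from the printout above.
column, p_value = pick_coefficient_p_value(
    probability=0.030479,
    robust_probability=0.038240,
    koenker_p=0.000018,
    alpha=0.01,
)
print(column, p_value)  # Robust_Pr 0.03824
```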