I'm scripting several Generalized Linear Regression (GLR) runs back to back with ArcPy and want to assess the p-values from each run. Calling getMessages() on the result object returns the full message printout (see below), which contains the significance values, but I can't find a way to parse that text and extract a specific p-value.
Can anyone advise how to script the extraction of the Robust_Pr value for the SIM2_NORM variable?
'Start Time: Tuesday 18 June 2024 08:13:44\njson:\n[{"element": "table", "data": [["Variable", ["Coefficient", {"element": "sup", "data": "a"}], "StdError", "t-Statistic", ["Probability", {"element": "sup", "data": "b"}], "Robust_SE", "Robust_t", ["Robust_Pr", {"element": "sup", "data": "b"}]], ["Intercept", "0.595155", "0.023640", "25.175822", ["0.000000", {"element": "sup", "data": "*"}], "0.022500", "26.451563", ["0.000000", {"element": "sup", "data": "*"}]], ["SIM2_NORM", "-0.102146", "0.047075", "-2.169875", ["0.030479", {"element": "sup", "data": "*"}], "0.049163", "-2.077722", ["0.038240", {"element": "sup", "data": "*"}]]], "elementProps": {"striped": "true", "title": "Summary of GLR Results [Model Type: Continuous (Gaussian/OLS)]", "0": {"align": "right", "pad": "0px", "wrap": true}, "1": {"align": "right", "pad": "0px", "wrap": true}, "2": {"align": "right", "pad": "0px", "wrap": true}, "3": {"align": "right", "pad": "0px", "wrap": true}, "4": {"align": "right", "pad": "0px", "wrap": true}, "5": {"align": "right", "pad": "0px", "wrap": true}, "6": {"align": "right", "pad": "0px", "wrap": true}, "7": {"align": "right", "pad": "0px", "wrap": true}}}]\njson:\n[{"element": "table", "data": [["Input Features", "RANDOM_SAMPLE_POINTS_Layer7", " Dependent Variable", "SIM1_NORM "], ["Number of Observations", "496", [" Akaike\'s Information Criterion (AICc)", {"element": "sup", "data": "d"}], "-402.173053 "], [["Multiple R-Squared", {"element": "sup", "data": "d"}], "0.009441", [" Adjusted R-Squared", {"element": "sup", "data": "d"}], "0.007436 "], [["Joint F-Statistic", {"element": "sup", "data": "e"}], "4.708358", " Prob(>F), (1,494) degrees of freedom", "0.096036 "], [["Joint Wald Statistic", {"element": "sup", "data": "e"}], "4.316927", " Prob(>chi-squared), (1) degrees of freedom", ["0.037735", {"element": "sup", "data": "*"}]], [["Koenker (BP) Statistic", {"element": "sup", "data": "f"}], "18.432843", " Prob(>chi-squared), (1) degrees of freedom", 
["0.000018", {"element": "sup", "data": "*"}]], [["Jarque-Bera Statistic", {"element": "sup", "data": "g"}], "5.001944", " Prob(>chi-squared), (2) degrees of freedom", "0.082005 "]], "elementProps": {"striped": "true", "noHeader": true, "title": "GLR Diagnostics", "0": {"align": "left", "pad": "0px", "wrap": true}, "1": {"align": "right", "pad": "0px", "wrap": true}, "2": {"align": "left", "pad": "0px", "wrap": true}, "3": {"align": "right", "pad": "0px", "wrap": true}}}]\njson:\n[{"element": "table", "data": [["*", "An asterisk next to a number indicates a statistically significant p-value (p < 0.01)."], ["a", "Coefficient: Represents the strength and type of relationship between each explanatory variable and the dependent variable."], ["b", "Probability and Robust Probability (Robust_Pr): Asterisk (*) indicates a coefficient is statistically significant (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Robust Probability column (Robust_Pr) to determine coefficient significance."], ["c", "Variance Inflation Factor (VIF): Large Variance Inflation Factor (VIF) values (> 7.5) indicate redundancy among explanatory variables."], ["d", "R-Squared and Akaike\'s Information Criterion (AICc): Measures of model fit/performance."], ["e", "Joint F and Wald Statistics: Asterisk (*) indicates overall model significance (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Wald Statistic to determine overall model significance."], ["f", "Koenker (BP) Statistic: When this test is statistically significant (p < 0.01), the relationships modeled are not consistent (either due to non-stationarity or heteroskedasticity). 
You should rely on the Robust Probabilities (Robust_Pr) to determine coefficient significance and on the Wald Statistic to determine overall model significance."], ["g", "Jarque-Bera Statistic: When this test is statistically significant (p < 0.01) model predictions are biased (the residuals are not normally distributed)."]], "elementProps": {"striped": "true", "noHeader": true, "title": "Notes on Interpretation", "0": {"align": "center", "pad": "0px", "wrap": true}, "1": {"align": "left", "pad": "0px", "wrap": true}}}]\nSucceeded at Tuesday 18 June 2024 08:13:49 (Elapsed Time: 5.30 seconds)'
Messages
Summary of GLR Results [Model Type: Continuous (Gaussian/OLS)]
Variable | Coefficient [a] | StdError | t-Statistic | Probability [b] | Robust_SE | Robust_t | Robust_Pr [b] |
Intercept | 0.595155 | 0.023640 | 25.175822 | 0.000000* | 0.022500 | 26.451563 | 0.000000* |
SIM2_NORM | -0.102146 | 0.047075 | -2.169875 | 0.030479* | 0.049163 | -2.077722 | 0.038240* |
GLR Diagnostics
Input Features | RANDOM_SAMPLE_POINTS_Layer7 | Dependent Variable | SIM1_NORM |
Number of Observations | 496 | Akaike's Information Criterion (AICc) [d] | -402.173053 |
Multiple R-Squared [d] | 0.009441 | Adjusted R-Squared [d] | 0.007436 |
Joint F-Statistic [e] | 4.708358 | Prob(>F), (1,494) degrees of freedom | 0.096036 |
Joint Wald Statistic [e] | 4.316927 | Prob(>chi-squared), (1) degrees of freedom | 0.037735* |
Koenker (BP) Statistic [f] | 18.432843 | Prob(>chi-squared), (1) degrees of freedom | 0.000018* |
Jarque-Bera Statistic [g] | 5.001944 | Prob(>chi-squared), (2) degrees of freedom | 0.082005 |
Notes on Interpretation
* | An asterisk next to a number indicates a statistically significant p-value (p < 0.01). |
a | Coefficient: Represents the strength and type of relationship between each explanatory variable and the dependent variable. |
b | Probability and Robust Probability (Robust_Pr): Asterisk (*) indicates a coefficient is statistically significant (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Robust Probability column (Robust_Pr) to determine coefficient significance. |
c | Variance Inflation Factor (VIF): Large Variance Inflation Factor (VIF) values (> 7.5) indicate redundancy among explanatory variables. |
d | R-Squared and Akaike's Information Criterion (AICc): Measures of model fit/performance. |
e | Joint F and Wald Statistics: Asterisk (*) indicates overall model significance (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Wald Statistic to determine overall model significance. |
f | Koenker (BP) Statistic: When this test is statistically significant (p < 0.01), the relationships modeled are not consistent (either due to non-stationarity or heteroskedasticity). You should rely on the Robust Probabilities (Robust_Pr) to determine coefficient significance and on the Wald Statistic to determine overall model significance. |
g | Jarque-Bera Statistic: When this test is statistically significant (p < 0.01) model predictions are biased (the residuals are not normally distributed). |
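For reference, here is a minimal sketch of one way the printout could be parsed. It assumes that each table in the message string sits on a single line directly after a "json:" marker line, as in the output above, and that a cell is either a plain string or a list of the form [text, {superscript}]. The helper names (extract_tables, cell_text, get_value) are hypothetical, not part of ArcPy, and the sample string is a trimmed stand-in for the real printout.

```python
import json


def extract_tables(messages):
    """Return the 'data' rows of every JSON table found in the message string.

    Assumption: a line reading 'json:' is followed by one line of JSON.
    """
    tables = []
    lines = messages.split("\n")
    for marker, payload in zip(lines, lines[1:]):
        if marker.strip() == "json:":
            for element in json.loads(payload):
                if isinstance(element, dict) and element.get("element") == "table":
                    tables.append(element["data"])
    return tables


def cell_text(cell):
    """Flatten a cell to text, dropping superscript dicts such as the '*' flag."""
    if isinstance(cell, list):
        return "".join(part for part in cell if isinstance(part, str)).strip()
    return str(cell).strip()


def get_value(messages, variable, column):
    """Look up one number by row label and column header; None if not found."""
    for table in extract_tables(messages):
        header = [cell_text(c) for c in table[0]]
        if column not in header:
            continue  # e.g. the diagnostics and notes tables
        col = header.index(column)
        for row in table[1:]:
            if cell_text(row[0]) == variable:
                return float(cell_text(row[col]))
    return None


# Trimmed stand-in for the real printout (only three of the eight columns).
sample_table = [{
    "element": "table",
    "data": [
        ["Variable", "Robust_t", ["Robust_Pr", {"element": "sup", "data": "b"}]],
        ["Intercept", "26.451563", ["0.000000", {"element": "sup", "data": "*"}]],
        ["SIM2_NORM", "-2.077722", ["0.038240", {"element": "sup", "data": "*"}]],
    ],
}]
sample = (
    "Start Time: Tuesday 18 June 2024 08:13:44\njson:\n"
    + json.dumps(sample_table)
    + "\nSucceeded"
)

print(get_value(sample, "SIM2_NORM", "Robust_Pr"))  # 0.03824
```

In a real script the first argument would presumably be the string returned by getMessages() on the GLR result rather than the sample. Matching on the header text instead of a fixed column position means the same lookup should also work for Probability or Coefficient, provided the header names stay as shown in the printout.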