
Accessing Regression Results with ArcPy

Tuesday
NiallDelany_SEAI
New Contributor

I'm trying to script several Generalized Linear Regression (GLR) runs back to back using ArcPy and assess the p-values of each run. Calling getMessages() on the result object returns the full message printout (see below), which contains the significance value, but I can't find a way to parse that text and extract the exact p-value.

Can anyone offer advice on how to script the extraction of the Robust_Pr (robust probability) value for the SIM2_NORM variable?

'Start Time: Tuesday 18 June 2024 08:13:44\njson:\n[{"element": "table", "data": [["Variable", ["Coefficient", {"element": "sup", "data": "a"}], "StdError", "t-Statistic", ["Probability", {"element": "sup", "data": "b"}], "Robust_SE", "Robust_t", ["Robust_Pr", {"element": "sup", "data": "b"}]], ["Intercept", "0.595155", "0.023640", "25.175822", ["0.000000", {"element": "sup", "data": "*"}], "0.022500", "26.451563", ["0.000000", {"element": "sup", "data": "*"}]], ["SIM2_NORM", "-0.102146", "0.047075", "-2.169875", ["0.030479", {"element": "sup", "data": "*"}], "0.049163", "-2.077722", ["0.038240", {"element": "sup", "data": "*"}]]], "elementProps": {"striped": "true", "title": "Summary of GLR Results [Model Type: Continuous (Gaussian/OLS)]", "0": {"align": "right", "pad": "0px", "wrap": true}, "1": {"align": "right", "pad": "0px", "wrap": true}, "2": {"align": "right", "pad": "0px", "wrap": true}, "3": {"align": "right", "pad": "0px", "wrap": true}, "4": {"align": "right", "pad": "0px", "wrap": true}, "5": {"align": "right", "pad": "0px", "wrap": true}, "6": {"align": "right", "pad": "0px", "wrap": true}, "7": {"align": "right", "pad": "0px", "wrap": true}}}]\njson:\n[{"element": "table", "data": [["Input Features", "RANDOM_SAMPLE_POINTS_Layer7", "  Dependent Variable", "SIM1_NORM "], ["Number of Observations", "496", ["  Akaike\'s Information Criterion (AICc)", {"element": "sup", "data": "d"}], "-402.173053 "], [["Multiple R-Squared", {"element": "sup", "data": "d"}], "0.009441", ["  Adjusted R-Squared", {"element": "sup", "data": "d"}], "0.007436 "], [["Joint F-Statistic", {"element": "sup", "data": "e"}], "4.708358", "  Prob(>F), (1,494) degrees of freedom", "0.096036 "], [["Joint Wald Statistic", {"element": "sup", "data": "e"}], "4.316927", "  Prob(>chi-squared), (1) degrees of freedom", ["0.037735", {"element": "sup", "data": "*"}]], [["Koenker (BP) Statistic", {"element": "sup", "data": "f"}], "18.432843", "  Prob(>chi-squared), (1) degrees of freedom", 
["0.000018", {"element": "sup", "data": "*"}]], [["Jarque-Bera Statistic", {"element": "sup", "data": "g"}], "5.001944", "  Prob(>chi-squared), (2) degrees of freedom", "0.082005 "]], "elementProps": {"striped": "true", "noHeader": true, "title": "GLR Diagnostics", "0": {"align": "left", "pad": "0px", "wrap": true}, "1": {"align": "right", "pad": "0px", "wrap": true}, "2": {"align": "left", "pad": "0px", "wrap": true}, "3": {"align": "right", "pad": "0px", "wrap": true}}}]\njson:\n[{"element": "table", "data": [["*", "An asterisk next to a number indicates a statistically significant p-value (p < 0.01)."], ["a", "Coefficient: Represents the strength and type of relationship between each explanatory variable and the dependent variable."], ["b", "Probability and Robust Probability (Robust_Pr): Asterisk (*) indicates a coefficient is statistically significant (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Robust Probability column (Robust_Pr) to determine coefficient significance."], ["c", "Variance Inflation Factor (VIF): Large Variance Inflation Factor (VIF) values (> 7.5) indicate redundancy among explanatory variables."], ["d", "R-Squared and Akaike\'s Information Criterion (AICc): Measures of model fit/performance."], ["e", "Joint F and Wald Statistics: Asterisk (*) indicates overall model significance (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Wald Statistic to determine overall model significance."], ["f", "Koenker (BP) Statistic: When this test is statistically significant (p < 0.01), the relationships modeled are not consistent (either due to non-stationarity or heteroskedasticity).  
You should rely on the Robust Probabilities (Robust_Pr) to determine coefficient significance and on the Wald Statistic to determine overall model significance."], ["g", "Jarque-Bera Statistic: When this test is statistically significant (p < 0.01) model predictions are biased (the residuals are not normally distributed)."]], "elementProps": {"striped": "true", "noHeader": true, "title": "Notes on Interpretation", "0": {"align": "center", "pad": "0px", "wrap": true}, "1": {"align": "left", "pad": "0px", "wrap": true}}}]\nSucceeded at Tuesday 18 June 2024 08:13:49 (Elapsed Time: 5.30 seconds)'

Messages

 
Start Time: Tuesday 18 June 2024 08:13:44
Summary of GLR Results [Model Type: Continuous (Gaussian/OLS)]
Variable    Coefficient [a]   StdError   t-Statistic   Probability [b]   Robust_SE   Robust_t    Robust_Pr [b]
Intercept   0.595155          0.023640   25.175822     0.000000*         0.022500    26.451563   0.000000*
SIM2_NORM   -0.102146         0.047075   -2.169875     0.030479*         0.049163    -2.077722   0.038240*
GLR Diagnostics
Input Features               RANDOM_SAMPLE_POINTS_Layer7   Dependent Variable                            SIM1_NORM
Number of Observations       496                           Akaike's Information Criterion (AICc) [d]     -402.173053
Multiple R-Squared [d]       0.009441                      Adjusted R-Squared [d]                        0.007436
Joint F-Statistic [e]        4.708358                      Prob(>F), (1,494) degrees of freedom          0.096036
Joint Wald Statistic [e]     4.316927                      Prob(>chi-squared), (1) degrees of freedom    0.037735*
Koenker (BP) Statistic [f]   18.432843                     Prob(>chi-squared), (1) degrees of freedom    0.000018*
Jarque-Bera Statistic [g]    5.001944                      Prob(>chi-squared), (2) degrees of freedom    0.082005
Notes on Interpretation
*    An asterisk next to a number indicates a statistically significant p-value (p < 0.01).
[a]  Coefficient: Represents the strength and type of relationship between each explanatory variable and the dependent variable.
[b]  Probability and Robust Probability (Robust_Pr): Asterisk (*) indicates a coefficient is statistically significant (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Robust Probability column (Robust_Pr) to determine coefficient significance.
[c]  Variance Inflation Factor (VIF): Large Variance Inflation Factor (VIF) values (> 7.5) indicate redundancy among explanatory variables.
[d]  R-Squared and Akaike's Information Criterion (AICc): Measures of model fit/performance.
[e]  Joint F and Wald Statistics: Asterisk (*) indicates overall model significance (p < 0.01); if the Koenker (BP) Statistic [f] is statistically significant, use the Wald Statistic to determine overall model significance.
[f]  Koenker (BP) Statistic: When this test is statistically significant (p < 0.01), the relationships modeled are not consistent (either due to non-stationarity or heteroskedasticity). You should rely on the Robust Probabilities (Robust_Pr) to determine coefficient significance and on the Wald Statistic to determine overall model significance.
[g]  Jarque-Bera Statistic: When this test is statistically significant (p < 0.01) model predictions are biased (the residuals are not normally distributed).
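
For what it's worth, the raw message string looks like it can be parsed directly: each table appears to follow a "json:" marker as a JSON array, and a cell is either a plain string or a [text, superscript] pair. The sketch below assumes that layout holds for every run; it is inferred only from the printout in this post, not from documented behaviour. The helper names (extract_tables, cell_text, get_table_value) are made up for illustration.

```python
import json


def extract_tables(messages):
    """Collect every 'table' element found after a 'json:' marker.

    Assumption: the layout matches the printout in this post, i.e. each
    marker is followed by a JSON array of element dictionaries.
    """
    # strict=False tolerates raw control characters inside JSON strings.
    decoder = json.JSONDecoder(strict=False)
    marker = "json:"
    tables = []
    pos = messages.find(marker)
    while pos != -1:
        start = pos + len(marker)
        # Skip the newline/whitespace between the marker and the payload.
        while start < len(messages) and messages[start].isspace():
            start += 1
        try:
            payload, end = decoder.raw_decode(messages, start)
        except json.JSONDecodeError:
            end = start  # not valid JSON here; move on to the next marker
        else:
            if isinstance(payload, list):
                tables.extend(
                    item for item in payload
                    if isinstance(item, dict) and item.get("element") == "table"
                )
        pos = messages.find(marker, end)
    return tables


def cell_text(cell):
    """Return a cell's text, dropping any superscript/footnote part."""
    if isinstance(cell, list) and cell:
        return str(cell[0]).strip()
    return str(cell).strip()


def get_table_value(messages, variable, column="Robust_Pr"):
    """Return the value in `column` for the row named `variable` as a float.

    Returns None if no table has that column and row.
    """
    for table in extract_tables(messages):
        rows = table.get("data", [])
        if not rows:
            continue
        header = [cell_text(c) for c in rows[0]]
        if column not in header:
            continue
        idx = header.index(column)
        for row in rows[1:]:
            if row and cell_text(row[0]) == variable and idx < len(row):
                return float(cell_text(row[idx]))
    return None
```

With the result object from the GLR run, this would be called as get_table_value(result.getMessages(), "SIM2_NORM"). Passing column="Probability" would read the non-robust p-value from the same row instead. Because the function returns None when nothing matches, a loop over several runs can check for that rather than failing outright if the message layout differs.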