POST

Hi Ms. Rosenshein,

Thanks for the response; I much appreciate the time you spent looking at my problem.

You wrote...

"Hey Andrew, To determine where the problem is, run the model using OLS and examine the VIF value for each explanatory variable. If some of the VIF values are large (above 7.5, for example), global multicollinearity is preventing GWR from solving."

Yup. Did this first. Nothing above 1.99 for VIF. I had already dropped a number of variables because of high VIFs. But "Income" stands alone, as does "population density", two of the variables I'm having the most issues with.

"More likely, however, local multicollinearity is the problem. Try creating a thematic map for each explanatory variable. If the map reveals spatial clustering of identical values, consider combining those variables with other explanatory variables to increase value variation."

Yes, did this too. I also ran LISA and Moran's I measures for all variables to detect spatial autocorrelation. I see the idea behind combining some variables, but then the model loses specificity in explanatory power (and given that the VIFs aren't indicating a model issue, I'm somewhat stumped). For example, in Baltimore City, the area in question, there is high local correlation between "percent black" and "percent living in poverty". However, the two variables are clearly separate "entities", and collapsing them loses far too much explanatory utility. Again, I suspect it's not these variables that are causing the model to fail structurally...

"Another option is to try transforming it (although not in the traditional sense of logs or powers): create a new field, then calculate the values to be the value (in this case the log) minus the mean for all values in that field. This doesn't actually change anything (the impact on results), but for some reason we've found that GWR likes variables in that form... and this transformation will often 'fix' the problem."

I'm thinking this might be the way to go. There is something in the numerical distribution of the variable values that seems to be problematic and causes the equations to falter at the immediate, local rendering of the GWR. It is odd that the log transformations fail when the untransformed values work. I'll check that the transformation follows the assumptions of the variables in question, though I thought transforming income to a log value was pretty standard? So it still makes me ask: what about the combination of equation values could be causing it to fail? (In particular, it's the (supposedly) open-ended distributions of "Income" and "population density", transformed to log values, that appear the most problematic in the local GWR executions.)

"Also, just a reminder to make sure that you find a properly specified OLS model before moving on to GWR. There is some great documentation about this, including this recent ArcUser article on Finding a Meaningful Model and the training seminar on Regression Analysis Basics."

Thanks for posting this stuff. I'd read the model specification material earlier, to be sure I was still on the right track, but the "Finding a Meaningful Model" piece will be good for checking my model against core assumptions again (BTW, it's a nice, simple, elegant piece; well done, and thanks for posting/sharing it). In my model estimations I've gone so far as to use the spatial modeling tools for checking sills, nuggets and so on to get a properly determined spatial model, banding distances, etc. Earlier, I also ran a global analysis of Pearson's correlations, running one variable against the next spatially, to determine whether there are spurious relationships (mediating and moderating factors) that could be skewing variable values as well.

Thanks for the input and suggestions; I will see where they take me. Clearly spatial analysis remains in the realm of "sausage making statistics": not pretty to know what goes in, but looks good on the other side, lol.

Best,
Andrew
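For what it's worth, here is a minimal sketch of the "log minus the mean" centering suggested above, in plain Python rather than the ArcGIS field calculator. The function name and the income values are made up for illustration, and I'm assuming the natural log and strictly positive values.

```python
import math

def centered_log(values):
    """Return ln(v) minus the mean of all ln(v). Assumes every v is > 0."""
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    # Subtracting a constant shifts the variable but keeps all differences.
    return [x - mean_log for x in logs]

# Hypothetical income values, for illustration only.
incomes = [18000.0, 32000.0, 54000.0, 120000.0]
centered = centered_log(incomes)
print(centered)
```

The centered values sum to roughly zero, and the gaps between observations are the same as in the plain log version, which fits the remark that the transformation doesn't change the results.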
07-26-2011
09:13 PM


POST

Hi Jochen,

First, "hello from Baltimore". Andr3w T1ml3ck here (I was in your class at UMD with Maurice C.)... Not sure if I can help at all; I've been messing with this stuff too...

In Fischer & Getis' Handbook of Applied Spatial Analysis (2010), Wheeler & Paez have a chapter on what works and what doesn't in GWR (see esp. pp. 486-469). In it they note that bandwidth and number-of-neighbors selection can be highly problematic: with too many neighbors and too far a reach, you of course get no spatial variability. With too few neighbors, too close together, you end up with spatial autocorrelation and wild local swings in regression coefficients, which sounds like what you're describing.

Cheers,
Andrew (Andy)

"Clearly, if the bandwidth is such as to include a large number of observations, there will be relatively little or no spatial variation in the coefficients, and if the bandwidth is small, there will potentially be large amounts of variation. A natural concern emerges that some variation or smoothness in the pattern of estimated coefficients may be artificially introduced by the technique and may not represent true regression effects. This situation is at the heart of the discussion about the utility of GWR for inference on regression coefficients and is not answered by existing statistical (Leung et al. 2000a) or Monte Carlo (Fotheringham et al. 2002) tests for significant variation of GWR coefficients because these tests do not consider the source of the variation. This is important because one source of regression coefficient variability in GWR can come from collinearity, or dependence in the kernel-weighted design matrix. Collinearity is known in linear models to inflate the variances of regression coefficients (Neter et al. 1996), and GWR is no exception (Griffith 2008). Collinearity has been found in empirical work to be an issue in GWR models at the local level when it is not present in the global linear regression model using the same data (Wheeler 2007).

In addition to large variation of estimated regression coefficients, there can be strong dependence in GWR coefficients for different regression terms, including the intercept, at least partly attributable to collinearity. Wheeler and Tiefelsdorf (2005) show in a simulation study that while GWR coefficients can be correlated when there is no explanatory variable correlation, the coefficient correlation increases systematically with increasingly more collinearity."
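To make the bandwidth point concrete, here is a rough sketch in plain Python. It is not how ArcGIS implements GWR: I'm assuming a Gaussian kernel, and the distances and the two variables are toy numbers invented for illustration. It shows two explanatory variables that are uncorrelated across the whole study area but almost perfectly correlated once the kernel weights concentrate on nearby observations.

```python
import math

def gaussian_weights(distances, bandwidth):
    """Gaussian kernel: weight decays with distance from the regression point."""
    return [math.exp(-0.5 * (d / bandwidth) ** 2) for d in distances]

def weighted_corr(x, y, w):
    """Pearson correlation of x and y under observation weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)

# Toy data: distances from one regression point, plus two explanatory variables
# that move together nearby and in opposite directions farther away.
distances = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0, 2.0, 3.0, 3.0, 2.0, 1.0]

local_r = weighted_corr(x, y, gaussian_weights(distances, bandwidth=2.0))
global_r = weighted_corr(x, y, gaussian_weights(distances, bandwidth=1000.0))
print(f"small bandwidth: r = {local_r:.3f}")
print(f"large bandwidth: r = {global_r:.3f}")
```

With the large bandwidth the weights are nearly uniform, so the result is close to the ordinary global correlation; with the small bandwidth only the three nearest observations matter, and there the two variables are collinear.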
06-28-2011
11:23 AM


POST

While specifying my GWR model and doing my OLS work I found, not surprisingly, that "income" and other variables like "population density" and "call_rate" needed transformation. So I used a normal log transformation, computed new field values for each, and created "log_income", "log_pop_density" and "log_call_rate". Before, when running OLS, everything seemed fine. But now, when running GWR, the analysis craps out with the ole...

ERROR 040038: Results cannot be computed because of severe model design problems.
Failed to execute (GeographicallyWeightedRegression).

So I narrowed it down to the "log_income" and "log_pop_density" variables: whenever "log_income" is included, either as an independent or a dependent variable, the model completely tanks. For the "log_pop_den" variable, the model tanks only when it's the dependent variable. Looking at the distribution graphically (attached), income seems OK. Perhaps this isn't the correct (best?) transformation though?

Methodologically, I'm working on a city-wide grid, measuring about 2000 x 2000 polygons high and wide, each measuring 250' x 250', where almost every one has data within it. I've set bandwidths to extremes, used minimum # of neighbors, AICc, etc. (every iteration possible) and can't seem to get it to fly. Clearly my "transformation" is the problem, but I'm not sure what that problem is.

Many thanks for anyone's eyes on this.
06-24-2011
10:55 PM


POST

"I looked at log transformations on the geostatistical analysis extension. I can see those transformations happening but I do not know how exactly to apply that conversion to the data (i.e. create a new field with the log transformation values so I can use them and reduce bias)."

May be the blind leading the blind... but I'll give it a shot:

1) Add a new field; call it something like log_var1.
2) I used Field Type: DOUBLE, Precision: 18, Scale: 9.
3) Right-click on the header of the column and select Field Calculator.
4) From the "Functions" menu on the right of the box, click "Log ()"... and it's added to the "log_var1 = " box.
5) From the Fields list, above, find the variable you want to transform and double-click it to add it.
6) Click 'OK' to compute the field.

You can then right-click on the header and select "Statistics" to see real quick how the distribution of the variable changes.

Hope that helps.
Andrew
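If you'd rather see the same idea outside the Field Calculator, here is a rough sketch in plain Python (not arcpy). The function name, the field names and the sample rows are hypothetical, and I'm assuming a natural log. It also leaves the new field empty wherever the source value is zero or negative, since the log is undefined there.

```python
import math

def add_log_field(rows, source, target):
    """Add target = ln(source) to each row dict; None where the value is not positive."""
    for row in rows:
        value = row[source]
        if value is not None and value > 0:
            row[target] = math.log(value)
        else:
            row[target] = None  # log is undefined for zero, negative or missing values
    return rows

# Hypothetical attribute-table rows, for illustration only.
rows = [{"var1": 100.0}, {"var1": 2500.0}, {"var1": 0.0}]
add_log_field(rows, "var1", "log_var1")
for row in rows:
    print(row)
```

Counting how many rows end up as None is a quick way to see whether zeros in the source field might be a problem before the variable goes into a regression.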
06-24-2011
10:27 PM


POST

Never mind. You have to load the tables in ArcGIS to view them. Maybe I've been up too late working, but why is EVERYTHING in ArcGIS, map, etc. one more step too difficult than it needs to be (or at least the 'help' files don't tell you simple stuff some days)? Well, hope this helps some other poor sod like me.
03-11-2011
02:12 AM


POST

Ugh, I'm running OLS before I run geographically weighted regression, but when I run it, having selected and specified a path name (without spaces) for the coefficients and diagnostic output files, and then open the resultant dbfs, they have the names of the variables in them... and that's IT. wtf? Any clues? (This is one of those "I'm doing my PhD and have a meeting on Monday" situations, so any help would be appreciated.) I tried just the path (can't do that) and path + name; it doesn't seem to matter. (Oh, it does generate an XML file as well, with a dbf extension (!?), but I can't get that to do anything either.) Oh, and the only way to capture the progress window output is to catch it right before it writes the values to the shapefile; otherwise it just CLOSES and you can't retrieve it (makes me nuts).
03-11-2011
02:07 AM


