POST
Hi @MatthewPoppleton, Yes, that equation from the ArcGIS 9 documentation is the formula for the K-Bessel semivariogram that is used for all instances of K-Bessel in Geostatistical Analyst. As you said, the detrended version performs a first-order trend removal, then applies the K-Bessel formula to the detrended values. Also, the K-Bessel semivariogram is often called the "Matérn" semivariogram in other geostatistical literature; you may be able to find more information by searching for that keyword instead. -Eric
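Since other literature uses the Matérn name, the usual Matérn semivariogram form from the general geostatistical literature may help for comparison (this is the standard textbook notation, not necessarily the ArcGIS 9 notation):

```latex
\gamma(h) = \theta_{n} + \theta_{s}\left[1 - \frac{2^{\,1-\nu}}{\Gamma(\nu)}\left(\frac{h}{\rho}\right)^{\nu} K_{\nu}\!\left(\frac{h}{\rho}\right)\right]
```

where \(\theta_n\) is the nugget, \(\theta_s\) the partial sill, \(\rho\) the range parameter, \(\nu\) the smoothness parameter, \(\Gamma\) the gamma function, and \(K_{\nu}\) the modified Bessel function of the second kind.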
Posted 3 weeks ago

POST
Hi @MWmep013, The reason for the discrepancy is that the GWR tool was reimplemented in ArcGIS Pro 2.3, and the previous version (equivalent to ArcMap) was deprecated. Among other things, the newer version uses a different and more common formula for global and local R-squared, and it optimizes bandwidths differently. The newer version follows the design and formulas of the GWR4 software (not from Esri). While you will not find it in the Geoprocessing pane, the deprecated version can still be used through arcpy (for example, in a Python Notebook or the Python window) with arcpy.stats.GeographicallyWeightedRegression(). You can see the documentation for the deprecated version here: https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/geographically-weighted-regression.htm. Using the deprecated version should produce the same results as the ArcMap version. As for why using Distance Band lowers the R-squared, I am not certain, but it likely has something to do with your particular data. Please let me know if you have any other questions. -Eric
Posted 07-05-2024 09:08 AM

POST
Hi @ZhichengZhong, ArcGIS does not have a GTWR tool, but in principle, including time should not alter the question of whether or how to test for significant explanatory variables. While the GWR tool does provide significance results for local models, it's understandable that other software packages and publications do not. The problem is that GW(T)R isn't really a single model: it is a collection of local models that are each estimated at the locations of the input features. Further, these models are correlated with each other, as they often share the same features in their neighborhoods. This creates problems related to multiple hypothesis testing: you are effectively testing N times the number of explanatory variables, and you should be cautious in interpreting any particular p-value when performing so many hypothesis tests. This is why, generally, explanatory variables are chosen using global models like OLS, and statistics like R-squared and AIC are used to determine how much better GW(T)R does compared to the global model.
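To put rough numbers on the multiple-testing burden, here is a minimal sketch (the feature and variable counts are made up for illustration):

```python
# Rough illustration of the multiple-testing burden in GWR-style local models.
# One local model is estimated per input feature, and each local model tests
# every explanatory variable, so the number of tests multiplies quickly.
n_features = 500        # hypothetical number of input features
n_explanatory = 3       # hypothetical number of explanatory variables
n_tests = n_features * n_explanatory

alpha = 0.05
# Expected false positives if every null hypothesis were true
# (and the tests were independent, which they are not here)
expected_false_positives = n_tests * alpha
# A Bonferroni-style correction for a 5% family-wise error rate
bonferroni_alpha = alpha / n_tests

print(n_tests)                   # 1500
print(expected_false_positives)  # 75.0
```

Even this simple count overstates the usable information, since neighboring local models share features and are therefore correlated, which is exactly why global-model diagnostics (R-squared, AIC) are the safer basis for variable selection.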
Posted 06-24-2024 07:21 AM

POST
Hi @af2k24, I don't believe there is any way to do this. If I'm understanding correctly, you want to take the coefficients from the original model and apply them to the new rasters. However, EBK Regression Prediction will rebuild the coefficients for any new rasters, estimating them from the input features that you provide; you can't save the coefficients and reuse them, unfortunately. The only thing that comes to mind is trying to forecast the input point values to 2070-2099 and use them along with the forecasted rasters to get a prediction surface for 2070-2099, though that is obviously easier said than done, especially if you do not have historical point data to build a forecast model. -Eric
Posted 06-12-2024 08:41 AM

POST
Hi @EToon, Unfortunately, as you found, this is not going to work. The iterators in ModelBuilder are designed to work with Feature Class and Field type parameters, but the Input Datasets parameter of the Create Geostatistical Layer tool is a custom parameter type (called a Geostatistical Value Table), where the input dataset(s) and the field(s) are contained in a single parameter. This is because different model sources require different fields. For your case, only a dataset and a field are required, but if you had performed cokriging with two datasets, for example, you would need to provide two feature classes and two fields. Other model sources would require other combinations of fields. Do your datasets and fields happen to have consistent names, something like data1, data2, data3, etc.? If so, this should be relatively simple to do in Python (I can help with this). But if they all have completely different names, you would need to type each out individually, which probably would not save much time over doing it manually. Sorry for the bad news, but I don't know any way around this. -Eric
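If the names do follow a pattern, the dataset/field pairs can be generated in a loop. A minimal sketch, assuming hypothetical names like data1/value1; the exact string format that the Geostatistical Value Table parameter expects should be taken from the Create Geostatistical Layer tool documentation:

```python
# Hypothetical example: datasets named data1..data5, each with a field
# named value1..value5. The "{dataset} {field}" row format below is only
# a placeholder; check the tool documentation for the real syntax.
pairs = [(f"data{i}", f"value{i}") for i in range(1, 6)]
value_table_rows = ["{} {}".format(ds, fld) for ds, fld in pairs]

print(value_table_rows[0])    # data1 value1
print(len(value_table_rows))  # 5
```

Each generated row would then be passed to the tool in a loop, which is the part that ModelBuilder's iterators cannot express for this parameter type.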
Posted 06-05-2024 08:29 AM

IDEA
This has been included in the product plan for ArcGIS Pro 3.4. The reimplementation will be a geoprocessing tool that creates a customized scatter plot chart on a feature layer, displaying the projected scatter plot and trend line of the XZ plane. In ArcGIS Pro 3.3, you can create this scatter plot manually with customized Arcade code using these steps:

1. On a feature layer, create a scatter plot chart by right-clicking the layer -> Create Chart -> Scatter Plot.
2. In the Chart Properties pane, for the "Y-axis Number", provide the analysis field.
3. For the "X-axis Number", click the "Set an expression" button to the right of the pulldown menu.
4. Paste the Arcade code at the end of this post into the "Expression" code block (make sure that the "Language" at the top is set to "Arcade").
5. Change the second line of the code to any desired direction. The direction is provided as degrees clockwise from north: for example, 0 is north, 90 is east, 180 is south, and 270 is west.
6. Click OK. The directional trend scatter plot will be displayed in the Chart pane.

You can click the "Set an expression" button again and change the direction, and the scatter plot will update to show the trend in the new direction. To show the polynomial trend line in the scatter plot, check the "Show trend line" checkbox in the Chart Properties pane, choose "Polynomial" from the dropdown, and provide a desired "Trend Order".

// Input direction as clockwise degrees from north
var angleFromNorth = 0;
// Convert direction to counterclockwise radians from east
var adjustedAngleDegrees = 90 - angleFromNorth;
adjustedAngleDegrees = adjustedAngleDegrees % 360;
var angleInRadians = adjustedAngleDegrees * PI / 180;
// Return x-coordinate of rotated coordinate system
return Centroid($feature).X * Cos(angleInRadians) + Centroid($feature).Y * Sin(angleInRadians)
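The same coordinate rotation is easy to sanity-check outside Arcade. A quick Python equivalent of the expression above (the point coordinates are made up):

```python
import math

def rotated_x(x, y, angle_from_north_deg):
    """Project (x, y) onto the axis pointing angle_from_north_deg degrees
    clockwise from north (the same math as the Arcade expression)."""
    adjusted_deg = (90 - angle_from_north_deg) % 360
    theta = math.radians(adjusted_deg)
    return x * math.cos(theta) + y * math.sin(theta)

# Direction 0 (north) projects onto the y-coordinate;
# direction 90 (east) projects onto the x-coordinate.
print(round(rotated_x(3.0, 4.0, 0), 6))   # 4.0
print(round(rotated_x(3.0, 4.0, 90), 6))  # 3.0
```

This confirms the direction convention: a trend "toward north" is simply the y-coordinate, and "toward east" is the x-coordinate, with every other direction a rotation between them.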
Posted 05-23-2024 07:02 PM

POST
Hi @NakkyEkeanyanwu, I think the major confusion is that Dimension Reduction is not selecting a subset of the variables that you provide. Instead, it uses all variables to construct new "components", and each component is a weighted sum of all the variables. As a very simple example, say you have four variables (A, B, C, and D) and you want to create one component (reducing the dimension from four to one). The component might look something like this (I am making up these coefficients):

Component = 0.7*A + 0.2*B + 0.6*C - 0.1*D

In essence, the component uses all variables, and the weights (the coefficients) indicate how "important" each variable is in the component. These coefficients are the eigenvector of the component, and the associated eigenvalue indicates how much of the total variability of the four variables is captured in the component. Frequently, a large percent of the total variability of all variables can be captured in just a few components, and this is what drives methods like the Broken stick and Bartlett's test: they try to find a compromise between minimizing the number of components and maximizing the amount of variability captured by the components. Determining how many components to create is the most difficult part of Principal Component Analysis, so various methodologies exist to help you decide. In an ideal case, you see some components account for a large percent of variance (PCTVAR field), then a sudden drop in the percent. However, for your data, I don't really see this; the variability captured by each component seems to drop steadily, and I think this is why Bartlett's method is recommending a large number of components. Using 7 components certainly seems justifiable here as well; really, you could justify any number between 3 and 28.
Regarding only 28 components explaining 100% of the variance, this means that two of the variables you provided are redundant, that their information is fully accounted for by other variables. If I'm reading your screenshots correctly, you use total population as a variable, and you also use the populations of particular subgroups. If the populations of the subgroups add up to the total population (or very close to it), then there is redundancy since the total population is captured by the sum of the populations of the subgroups. I suspect this is happening for two variables, resulting in 28 components that account for all variability. Please let me know if you have any other questions.
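The redundancy argument can be sketched with a toy example: if one column is the exact sum of two others, the data matrix loses a dimension, which is why fewer components than variables can explain 100% of the variance. A minimal pure-Python check with made-up numbers:

```python
# Toy illustration: a "total" column equal to the sum of two subgroup
# columns is linearly dependent, so the data matrix has rank 2, not 3.
# All values are made up.

def matrix_rank(rows, tol=1e-9):
    """Rank via Gaussian elimination (sufficient for a small toy matrix)."""
    m = [list(r) for r in rows]
    rank = 0
    row = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(row, len(m)) if abs(m[r][col]) > tol), None)
        if pivot is None:
            continue
        m[row], m[pivot] = m[pivot], m[row]
        for r in range(len(m)):
            if r != row and abs(m[r][col]) > tol:
                factor = m[r][col] / m[row][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[row])]
        row += 1
        rank += 1
    return rank

subgroup_a = [10, 20, 30, 40]
subgroup_b = [5, 15, 25, 35]
total = [a + b for a, b in zip(subgroup_a, subgroup_b)]  # redundant column

rows = list(zip(subgroup_a, subgroup_b, total))
print(matrix_rank(rows))  # 2: the third variable adds no new information
```

With real population data the dependence is usually approximate rather than exact, but near-redundant variables still produce components with near-zero eigenvalues, which matches seeing 100% of the variance explained by 28 of 30 components.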
Posted 04-30-2024 08:56 AM

POST
Thank you for the recommendation. We will add this to the documentation. The reason the tool does not refer to them as "fixed" and "adaptive" within the tool is that these are both general paradigms rather than specific neighborhood types. Using a number of neighbors is just one kind of adaptive neighborhood, and a fixed distance is one kind of fixed neighborhood. If the tool just said "Adaptive", you would then need to ask what kind of adaptive neighborhood it is; it is specifically a number-of-neighbors neighborhood. Similarly with fixed distance bands.
Posted 04-10-2024 11:01 AM

POST
GWR is a relatively recent tool (there is also an older version that is now deprecated), so it creates a Source ID field on the output features rather than requiring an input Unique ID field.
Posted 04-10-2024 10:56 AM

POST
Hi @geolane93_KU, Without seeing the data and having a better understanding of the purpose, it's difficult to give concrete recommendations. However, I do have a few thoughts that might help. First, if you have ArcGIS Pro 3.0 or later, look into the Compare Geostatistical Layers tool. You can create various EBK3D outputs and compare their cross validation statistics to see which are more accurate than others. That can help in choosing a subset size, transformations, and semivariogram models. Second, a subset size of 20 sounds quite small to me, particularly for the K-Bessel semivariogram. My experience is that you should use at least 50 points in each subset for a semivariogram model with so many parameters (and, usually, more than 100 is better). Third, I would consider removing some of the surface points that may be playing too dominant a role in the model. The problem is alleviated somewhat by using sectored neighborhoods, but the comparatively dense sampling at the surface is likely still negatively impacting subsurface predictions. In particular, I suspect that the estimated Elevation Inflation Factor (EIF) is being most affected here, and the EIF is an extremely important parameter for accurate results. Fourth, if the jagged edges and artifacts are far away from the input points (like in the top or bottom corner of the 3D extent), then I would not worry too much about them. EBK (2D and 3D) often produces these kinds of artifacts when you extrapolate (predicting outside the input points), but it tends to be very stable when interpolating (predicting between the input points).
Posted 04-10-2024 10:52 AM

POST
Hi @JamalNUMAN, "Number of Neighbors" is an adaptive bandwidth because the distance used at a location depends on the distance to the last neighbor, so it will vary ("adapt") depending on the location. I believe you are looking at the documentation for an older and deprecated version of GWR. Please find the documentation for the new version here: https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/geographicallyweightedregression.htm
Posted 04-02-2024 06:46 AM

POST
Hi @JamalNUMAN, Requiring a Unique ID field is an older design pattern that is not used in more recent tools. In fact, the Generalized Linear Regression tool (with Gaussian model type) does the same thing as the OLS tool, and it does not require a Unique ID field. The idea behind the Unique ID field is that it gets copied to the output features, so you can join the output results back to the input (or vice versa). For example, if you have a selection, the output features will not have the same Object IDs as the input, so some other field needs to be used to match input/output. In more recent tools, each Object ID from the input is copied to a "Source ID" field of the output features. This serves the same purpose (being able to match output to input) but does not require that you provide a field.
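The Source ID pattern can be sketched with plain dictionaries. This is only an illustration of the join logic; the row values and the exact residual field name are made up:

```python
# Sketch of the Source ID pattern (values and field names are hypothetical).
# Output rows carry a SOURCE_ID that points back at the input OBJECTID,
# so input and output can be matched even when the Object IDs differ
# (for example, when the tool ran on a selection).
input_rows = {1: {"value": 10.0}, 2: {"value": 12.5}, 7: {"value": 9.1}}

# The output gets fresh, sequential Object IDs but records the source.
output_rows = [
    {"OBJECTID": 1, "SOURCE_ID": 1, "residual": 0.3},
    {"OBJECTID": 2, "SOURCE_ID": 2, "residual": -0.1},
    {"OBJECTID": 3, "SOURCE_ID": 7, "residual": 0.8},
]

# Join output back to input via SOURCE_ID rather than OBJECTID.
joined = {row["SOURCE_ID"]: (input_rows[row["SOURCE_ID"]]["value"], row["residual"])
          for row in output_rows}
print(joined[7])  # (9.1, 0.8)
```

Note that input Object ID 7 became output Object ID 3, yet the join still resolves correctly, which is exactly the problem the old Unique ID field was meant to solve.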
Posted 04-01-2024 10:57 AM

POST
Hi @DOEEYANG, Can you clarify how you are performing kriging? I'm guessing the Kriging tool in the Spatial Analyst toolbox, but there are a few different versions. Without looking at your data, my guess is that these areas with no predictions are outside the neighborhood of your input points. Assuming you're using the tool above, check the "Search radius" parameter. If you are using a "Variable" neighborhood, check whether there is a "Maximum distance" value. If using a "Fixed" neighborhood, check the "Distance" value. If your cells with no predictions are further than this distance from any input point, the value cannot be interpolated. Using a sufficiently large distance should allow you to interpolate everywhere in your study area. Please let me know if this does not resolve the problem or if you're using any of the kriging methods in Geostatistical Analyst.
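The "no predictions" check can be sketched directly: a cell can only be interpolated if at least one input point falls within the search distance of it. A minimal example with made-up coordinates:

```python
import math

def within_search_radius(cell, points, max_distance):
    """True if any input point is within max_distance of the cell center."""
    return any(math.dist(cell, p) <= max_distance for p in points)

# Hypothetical input points and a 6-unit search distance.
points = [(0.0, 0.0), (10.0, 0.0)]
print(within_search_radius((5.0, 0.0), points, max_distance=6.0))   # True
print(within_search_radius((5.0, 40.0), points, max_distance=6.0))  # False: NoData cell
```

Cells like the second one, beyond the search distance of every input point, are the ones that come back empty, and enlarging the distance (or switching neighborhood settings) is what fills them in.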
Posted 02-14-2024 07:01 AM

POST
GWR will not use the z-coordinate in any capacity. So if you have multiple points at the same (x, y) but different z, GWR will treat them as being at the same location. Splitting your dataset by floor and independently performing GWR is the only solution that immediately comes to mind. The problem of constant values of the explanatory/dependent variable is more difficult, as GWR will return an error if any neighborhood contains a constant value of any explanatory variable or the dependent variable. To calculate GWR results, you'll need to use neighborhoods large enough to ensure this never happens. However, if the neighborhoods are very large, GWR effectively turns into OLS. Hopefully there is some range of neighborhood that can estimate local effects but still never encounter neighborhoods with constant values.
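The two-step workaround above, grouping points by floor and then flagging groups where a variable is constant, can be sketched like this (all coordinates and values are made up):

```python
from collections import defaultdict

# (x, y, z_floor, dependent_value) tuples; values are made up.
points = [
    (1.0, 2.0, 1, 5.0),
    (1.5, 2.5, 1, 5.0),   # floor 1 has a constant dependent value
    (1.0, 2.0, 2, 3.0),
    (1.5, 2.5, 2, 4.0),
]

# Step 1: split the dataset by floor so GWR can be run per floor.
by_floor = defaultdict(list)
for x, y, z, value in points:
    by_floor[z].append((x, y, value))

# Step 2: flag floors where the dependent variable is constant,
# since GWR would raise an error for such neighborhoods.
constant_floors = [z for z, pts in by_floor.items()
                   if len({v for _, _, v in pts}) == 1]
print(sorted(by_floor))  # [1, 2]
print(constant_floors)   # [1]
```

In practice the same constant-value check would need to be applied per neighborhood rather than per floor, but splitting first makes the problem cases easy to spot.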
Posted 02-14-2024 06:34 AM

POST
Hi @JamalNUMAN, I don't think it does any data splitting for the statistics in your images. Data splitting is not required in order to compute them, and in my experience, OLS, GWR, and other variants of the general linear model do not perform data exclusion to calculate them. In recent years, I've seen GWR used with data splitting (to make it more in line with machine learning workflows), but I do not think the GWR tool does this. Also, I'd suggest that you ask your GWR questions (and any other questions about the Spatial Statistics toolbox) in the Spatial Statistics Place. I know a lot about GWR as a theory, but I'm less knowledgeable about the specifics of the implementation of the GWR tool. For example, I do not know why those three statistics are calculated, but others (like MAPE) are not.
Posted 02-14-2024 06:22 AM
| Title | Kudos | Posted |
| --- | --- | --- |
|  | 1 | 3 weeks ago |
|  | 1 | 08-21-2012 09:47 AM |
|  | 1 | 07-05-2024 09:08 AM |
|  | 1 | 06-24-2024 07:21 AM |
|  | 1 | 04-10-2024 10:56 AM |