Hello,
A kernel density routine developed for the "R" language has been popular for use by government agencies to estimate core home range for Bighorn Sheep using telemetry points as input data. Mostly out of curiosity, I developed a routine using the most recent ArcGIS kernel density routine in Spatial Analyst. It produces very similar results using the HREF bandwidth. I asked the developer of the R home range routine what he thought and this was his answer:
This is his reasoning for using only the R routine and kernel density function and not using the ArcGIS python routine that I developed:
"The main issue is that the Home Range Arc tool uses a quartic kernel density function which is only an approximation to the Gaussian density function employed by the R home range estimator. ArcGIS uses it because it's fast and good enough for visualizations for every day users, but it's not a density function that you actually see in the home range literature. It's also not the density function used when we were estimating the foray frequencies and distances, which is in itself a reason not to go with it."
So, my question to you is "do you agree that ArcGIS is not capable of generating a home range polygon from telemetry points?". My reasoning is that it is because coming up with a representative polygon is an art and not an exact science. Sometimes the home range polygon is even hand delineated using expert knowledge. Also, the ArcGIS kernel density function was improved with version 10.3.
When using the HREF bandwidth what I have observed is that the R-script performance is faster on small datasets of telemetry points but the ArcGIS python routine is faster on larger datasets. The resulting polygon is very close.
Your thoughts regarding this will be appreciated. I have the routine working with both desktop and PRO.
Thanks,
Dan
everyone has an opinion.
you can roll out your own if you are concerned since you have access to numpy and scipy in arcmap/ArcGIS pro
scipy.stats.gaussian_kde — SciPy v1.1.0 Reference Guide
some of the same literature is referenced and you could do a comparison if you wanted
Dan,
Thanks for the good information. I will look into these new
capabilities.
you can try their data for starters... just to give you an idea how to set it up with some 'fake' parameters
from scipy import stats
def measure(n):
"Measurement model, return two coupled measurements."
m1 = np.random.normal(size=n)
m2 = np.random.normal(scale=0.5, size=n)
return m1+m2, m1-m2
n = 5
m1, m = measure(n)
m1, m2 = measure(n)
xmin = m1.min()
xmax = m1.max()
ymin = m2.min()
ymax = m2.max()
X, Y = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
positions = np.vstack([X.ravel(), Y.ravel()])
values = np.vstack([m1, m2])
kernel1 = stats.gaussian_kde(values, 'scott')
kernel2 = stats.gaussian_kde(values, 'silverman')
Z1 = np.reshape(kernel1(positions).T, X.shape)
Z2 = np.reshape(kernel2(positions).T, X.shape)
np.allclose(Z1, Z2)
True # result
Hi Dan P.,
Thanks for your examples.
I have an ArcGIS raster to numpy array and back to ArcGIS raster working via the following code:
quick look for now
inRas must be converted to a float because it is text right now
/blogs/dan_patterson/2016/08/14/script-formatting Dan can you reformat the code so it has line numbers etc.
Parameters will be text using GetParameterAsText and if they are supposed to be int or float, they need to be converted appropriately
Hi Dan,
I am sorry for the confusion. That first script using inRas has nothing to do with my problem. It works well. The arguments are only used to locate the raster and the output GDB to save to. I was just giving you some feedback that I was successful in being able to go from a geodatabase raster to a numpy array and then back to an identical geodatabase raster per my question in another post. I.E. that first script works well for me solving a problem that I asked you about in another post below this one (creating a raster with coordinates).
The 2nd part of the post above is the problem. I'm trying to pass a known bandwidth to the scipy gaussian kernel density routine rather than using one of the other options. I do this with the ArcGIS kernel density routine by specifying a calculated searchRadius. However, the gaussian routine comes up with an error on the bandwidth option when I use:
kernel = stats.gaussian_kde(my_array, bw_method=searchRadius)
Search radius is a calculated floating point number using the reference bandwidth equation. It could be that I am fooled by the bandwidth parameter error message and that the array I am passing is messing it up. The array is generated from a telemetry point feature class or shape file (a selection of points for a particular animal) and the following statement seems to work to successfully create the numpy array:
my_array = arcpy.da.FeatureClassToNumPyArray (Location + "\\\\" + "Z" + str(animal) + "pts", ["OID@", "SHAPE@XY"])
But, then when passing the array and search radius to the gaussian kde that statement fails saying there is a problem with the bandwidth parameter (bw_estimator).
There is no hurry on my end and you don't need to worry about this - just thought I would throw out the problem in case anyone knew a quick answer.
Dan
Ok... that array is not of the form that is required by the kernel density estimate.
The current array is called a 'structured array' and since it has the id and the X,Y coordinates, there is a little trick to extract the x,y coordinates so that you get an 'ndarray' which is the floating point values of the array without the field names.
I have several little functions that make it easier, and I can pop them into a documentation example if you zip a shapefile with sample of the data so I can document the steps to make it easier.
Hi Dan P.,
The data is sensitive so I cannot share it. It is just a regular point feature class with some attributes. But, you have me on the right track now. I will double check and use a non scaler (default without bandwidth) and it should flag the array as the error or even still the bandwidth even though only one parameter. Then I will look into array types or perhaps make an numpy array from my list of x,y points that I use to get the reference value. So, it must think the 3rd value in the existing array is band width parameter. I just need to put a little more effort into finding a solution.
Thanks,
Dan
Ok... I will send some fake point data so you can see how it is extracted