I am working in ArcGIS Desktop 10.4 (Advanced license), and I also have ArcGIS Pro 1.4 available for visualizing in 3D.
My goal is to take disease data for a large dataset of points from 10 different time periods, append them together, and input these points into the Create Space Time Cube tool so I can run Emerging Hot Spot Analysis afterward. Information on the disease dataset:
The data started as 10 separate point feature classes, all with identical attribute tables. Each table has 3 fields: a unique ID, a disease flag (1 or 0), and a date field. I use the Append tool to combine all ten, which gives me 1 large appended point feature class with about 2 million points (~200,000 in each time period). Several of the points are coincident in the appended feature class but have different dates, which is what I need for looking at disease hotspots over time.
When I input this appended point feature class into the Create Space Time Cube tool, I choose the disease field I want to analyze over time as the analysis field, and the tool then asks which summary statistic I would like to use on it, as you can see in the image. If I set the Neighborhood Distance of the bin to 1 mile and choose to sum the disease field, it will count all the 1's in the disease field. Since the values are all 1's and 0's, that is not prevalence, just a count of the 1's. If I choose mean, that might work: it would take the average of the 1's and 0's per bin and track it over time. But I'd like more control over what goes into the space time cube, and I'm not entirely sure what the mean is doing to all of the points being binned.
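To make the sum-versus-mean distinction concrete, here is a minimal plain-Python sketch of the arithmetic only. It is not how the Create Space Time Cube tool is implemented internally, and the bin IDs, time steps, and records are hypothetical. It illustrates that the sum of a 1/0 flag in a bin is a case count, while the mean of the same flag is cases divided by total points in that bin.

```python
from collections import defaultdict

# Hypothetical records: (bin_id, time_step, disease_flag).
# Bin IDs and time steps are made up for illustration only.
records = [
    ("A", 1, 1), ("A", 1, 0), ("A", 1, 0), ("A", 1, 1),
    ("A", 2, 1), ("A", 2, 0),
    ("B", 1, 0), ("B", 1, 0), ("B", 1, 1),
]

def summarize(records):
    """Return {(bin_id, time_step): (sum_of_flags, mean_of_flags)}."""
    groups = defaultdict(list)
    for bin_id, step, flag in records:
        groups[(bin_id, step)].append(flag)
    return {
        key: (sum(flags), sum(flags) / len(flags))
        for key, flags in groups.items()
    }

summary = summarize(records)
for key in sorted(summary):
    total, mean = summary[key]
    print(key, "sum =", total, "mean =", round(mean, 3))
```

In this toy data, bins ("A", 1) and ("A", 2) have different case counts but the same mean, which is the difference between a count and a proportion.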
When I do a point-level hot spot analysis for a single point in time (that is, not for the space time cube), I first run Integrate with an XY Tolerance of 1 mile. I then run Collect Events on the points where disease = 1, clear my query, and run Collect Events again on all points to get the total count. Next I spatially join the disease = 1 result (with the Min summary statistic checked) to the total-count Collect Events result. In the resulting spatially joined feature class, I add a new field called Prevalence and calculate it as the disease = 1 count divided by the total count. I put this spatially joined point feature class into the Optimized Hot Spot Analysis tool, choose the prevalence rate as the analysis field, and the result of that tool gives me point-level hot spots. I want some equivalent of this point-level prevalence rate to input into the space time cube.
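For reference, the prevalence calculation in that workflow amounts to the following, written as a small plain-Python sketch rather than with the geoprocessing tools themselves. The coordinates and flags are hypothetical, and the points are assumed to have already been snapped to shared locations (the role Integrate plays in my workflow).

```python
from collections import Counter

# Hypothetical points after snapping: (x, y, disease_flag).
points = [
    (100.0, 200.0, 1), (100.0, 200.0, 0), (100.0, 200.0, 0),
    (350.0, 80.0, 1), (350.0, 80.0, 1),
    (500.0, 500.0, 0),
]

def prevalence_by_location(points):
    """Return {(x, y): cases / total} for each distinct location."""
    totals = Counter()  # all points per location
    cases = Counter()   # points with disease_flag == 1 per location
    for x, y, flag in points:
        totals[(x, y)] += 1
        if flag == 1:
            cases[(x, y)] += 1
    # Locations with no cases get 0.0 here; in a spatial join they
    # would typically come through as a null count instead.
    return {loc: cases[loc] / totals[loc] for loc in totals}

prevalence = prevalence_by_location(points)
for loc in sorted(prevalence):
    print(loc, round(prevalence[loc], 3))
```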
What I am looking for help with: if I choose a 1-mile Integrate XY tolerance and follow the same steps I normally use for regular hot spots (as described in the previous paragraph), and then choose a 1-mile bin size in Create Space Time Cube, how can I ensure that only 1 point (per time period) with an associated prevalence rate goes into each bin, so that my rates are preserved in the space time cube and the result is as close as possible to my normal Optimized Hot Spot Analysis approach?
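To show what I mean by "1 point per bin per time period", here is a rough plain-Python sketch that assigns each point to a 1-mile square cell and a time step, then computes one prevalence value per (cell, time step). Everything here is an assumption for illustration: the coordinates are taken to be projected meters, the grid origin at (0, 0) is hypothetical, and I do not know whether Create Space Time Cube places or aligns its bins this way.

```python
import math
from collections import defaultdict

MILE_IN_METERS = 1609.344  # assumes projected coordinates in meters

def cell_index(x, y, origin_x, origin_y, cell_size):
    """Return the (col, row) of the square cell containing (x, y)."""
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    return col, row

def aggregate(points, origin_x, origin_y, cell_size):
    """points: iterable of (x, y, time_step, disease_flag).

    Returns {(col, row, time_step): (cases, total, prevalence)},
    i.e. exactly one record per cell per time step.
    """
    counts = defaultdict(lambda: [0, 0])  # [cases, total]
    for x, y, step, flag in points:
        key = cell_index(x, y, origin_x, origin_y, cell_size) + (step,)
        counts[key][0] += flag
        counts[key][1] += 1
    return {k: (c, n, c / n) for k, (c, n) in counts.items()}

# Hypothetical sample points: (x, y, time_step, disease_flag).
points = [
    (10.0, 10.0, 1, 1),
    (1500.0, 900.0, 1, 0),
    (1700.0, 10.0, 1, 1),
    (10.0, 10.0, 2, 0),
]
result = aggregate(points, 0.0, 0.0, MILE_IN_METERS)
for key in sorted(result):
    print(key, result[key])
```

If the cube's bins could be made to line up with cells like these, each bin would receive a single precomputed rate per time step, which is the behavior I am after.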