Hi,
I am a little bit confused about the Define Global Window parameter.
My understanding is that if I choose Entire Cube, the Emerging Hot Spot Analysis tool starts comparing from the first day, and Individual Time Step picks the last day of my data to compare with.
Neighborhood Time Step (NTS) compares with the steps in between (somewhere in the middle, or wherever I define).
My data has daily values for 45 days and NTS is set to 1 day. So, should I expect the results to be the same as the Individual Time Step results?
Hi,
If you choose Entire Cube, the tool conceptually compares the mean of each bin in the space time cube to the mean for all bins in the space time cube and determines if the means are significantly different (it's a bit more complicated than that, taking the number of features and variance into account, not just the mean).
Check out this video, especially beginning at 5:37. https://www.youtube.com/watch?v=9VDRYBvOoDI&list=PLGZUzt4E4O2LuV0vuH74WN6j9nxv0jUty&index=5
That video is part of a learning path about space time analysis, in case it's useful: https://learn.arcgis.com/en/paths/spatio-temporal-analysis-of-covid-19-daily-confirmed-cases/
Best wishes,
Lauren
Hi Dr. Griffin--Thank you for the illustrative explanation of emerging hotspot analysis in your video. If you don't mind, I would like to make sure I am understanding the differences between the options to choose for the "Define Global Window" parameter. I understand the "entire cube" option compares the local mean to the global mean (which is calculated for the entire cube). However, the remaining two options ("neighborhood timestep" and "individual timestep") are still a bit unclear to me, so I'd like to confirm if I am understanding this correctly. Is the primary difference that the global mean for the "neighborhood timestep" is calculated from TWO timesteps (current + previous), whereas the global mean for the "individual timestep" is calculated from only ONE timestep (current)? Thank you! ~Ashley
Almost exactly right! Humor me for a minute, I'm going to backtrack just a bit. Hot Spot Analysis and Emerging Hot Spot Analysis both run the Gi* Statistic under the hood. Conceptually, the Gi* statistic works by computing the mean value for a feature and its neighbors, and comparing that local mean to the global mean (the mean value for ALL features in the dataset). Then, taking the number of features and variance into account, it decides if the local mean is different enough from the global mean to be a statistically significant hot or cold spot.
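To make the "local mean versus global mean" idea concrete, here is a minimal Python sketch of the standard Getis-Ord Gi* z-score with binary weights (every neighbor counts equally). The function name `gi_star` and its arguments are made up for illustration, and this is not the tool's actual implementation; it assumes the values are not all identical and that the neighborhood is smaller than the full dataset.

```python
import math

def gi_star(values, neighborhood):
    """Gi* z-score for one feature, assuming binary (0/1) weights.

    values: every value in the reference ("global") set.
    neighborhood: indices into values for the feature itself and its neighbors.
    """
    n = len(values)
    global_mean = sum(values) / n
    # Standard deviation of the whole reference set (population form).
    s = math.sqrt(sum(v * v for v in values) / n - global_mean ** 2)
    # With binary weights, the sum of weights and of squared weights is the neighbor count.
    k = len(neighborhood)
    local_sum = sum(values[i] for i in neighborhood)
    # How far the local sum is from what the global mean would predict.
    numerator = local_sum - global_mean * k
    # Scale by the variance and by the number of features involved.
    denominator = s * math.sqrt((n * k - k * k) / (n - 1))
    return numerator / denominator

values = [1, 1, 1, 1, 1, 1, 1, 1, 10, 10]
print(gi_star(values, [8, 9]))  # about 3.0: the two high values form a hot spot
```

A large positive score means the local mean sits well above the global mean, and a large negative score means it sits well below.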
For Emerging Hot Spot Analysis, neighbors can be defined to include temporal AND spatial neighbors, as you know. Let's consider an easy one: you define neighbors to be the 8 closest spatial neighbors and 1 time step. The local neighborhood would be 9 features from the same time step (the feature itself plus the 8 nearest neighboring features) and 9 features from the preceding time step, for a total of 18 features in the neighborhood. If you choose 2 for the Neighborhood Time Step parameter, there will be a total of 27 features in the neighborhood for each feature in the dataset. Note: the Gi* analysis is performed for every feature at every time step.
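The arithmetic behind those counts can be sketched in a few lines of Python. The function name `neighborhood_size` is hypothetical, and the sketch ignores the start of the cube, where fewer preceding time steps are available.

```python
def neighborhood_size(spatial_neighbors, neighborhood_time_step):
    """Number of features in one local neighborhood (the feature itself included)."""
    per_step = spatial_neighbors + 1      # the feature plus its spatial neighbors
    steps = neighborhood_time_step + 1    # the current step plus the preceding ones
    return per_step * steps

print(neighborhood_size(8, 1))  # 18
print(neighborhood_size(8, 2))  # 27
```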
Now on to your question:
If you choose Entire Cube, the tool decides whether a feature and its neighbors at a particular location in the cube form a statistically significant hot or cold spot by comparing the local mean to the mean for all features in the cube. You did understand that one well! 🙂
If you choose Neighborhood Time Step (let's assume you typed 2 for the Neighborhood Time Step parameter), the tool will compare the local mean for a feature's 27-feature neighborhood to the mean for all features in the same time step along with all features in the preceding 2 time steps. This is a good option if you have strong trends in the data (like with Covid: fewer cases at the beginning, more at the end of the time period you're analyzing). So that's the one you didn't quite get right. The number of preceding time steps included in the neighborhood ("global") mean will match whatever you typed for the Neighborhood Time Step parameter, so the total wouldn't necessarily be TWO.
Last is Individual Time Step. This option is like taking a snapshot at each time step to compute the individual ("global") mean. The local mean for each feature's neighborhood (27 features, assuming you typed 2 for the Neighborhood Time Step parameter) will be compared to the mean for all features in the same time step. Only one time step is used to compute the "global" mean, and it's based on all features in the current time step (the time step matching the feature being analyzed). This is often a good option if you are making decisions about how to react today (assuming the final time step is today) to address the emerging trends.
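The three options differ only in which time steps feed the "global" mean, and a rough Python sketch can show that side by side. It assumes the cube is a plain list where `cube[t]` holds the values for all locations at time step `t`. The function names, the option strings, and the clamping at the first time step are illustrative assumptions, not the tool's actual code.

```python
def reference_values(cube, t, global_window, neighborhood_time_step=1):
    """Values that feed the "global" mean when analyzing a feature at time step t."""
    if global_window == "ENTIRE_CUBE":
        steps = range(len(cube))                   # every time step in the cube
    elif global_window == "NEIGHBORHOOD_TIME_STEP":
        # Current step plus the preceding steps (assumed to stop at step 0).
        first = max(0, t - neighborhood_time_step)
        steps = range(first, t + 1)
    elif global_window == "INDIVIDUAL_TIME_STEP":
        steps = range(t, t + 1)                    # only the current time step
    else:
        raise ValueError(f"unknown global window: {global_window}")
    return [value for step in steps for value in cube[step]]

def reference_mean(cube, t, global_window, neighborhood_time_step=1):
    values = reference_values(cube, t, global_window, neighborhood_time_step)
    return sum(values) / len(values)

# Toy cube: 4 time steps, 2 locations each, with a strong upward trend.
cube = [[1, 1], [2, 2], [3, 3], [10, 10]]
print(reference_mean(cube, 3, "ENTIRE_CUBE"))                # 4.0
print(reference_mean(cube, 3, "NEIGHBORHOOD_TIME_STEP", 2))  # 5.0
print(reference_mean(cube, 3, "INDIVIDUAL_TIME_STEP"))       # 10.0
```

With a trend like this one, the narrower the window, the closer the "global" mean sits to the most recent values, so a location has to stand out from its own time period to register as a hot spot.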
I hope this is clearer. If not, please ask again.
Best wishes,
Lauren