POST
Hi Cheryl,

Hmmm... small sample size, no obvious clustering when the data is mapped... There may be better solutions, but here is one idea: if you have samples from at least 30 households in a village and have results (positive or negative) for all occupants within each household surveyed, you can calculate a ratio for each household: the number of positive cases (positive only) divided by the total number of cases (positive + negative) gives you the percent positive in each household.

If Euclidean distance (as the crow flies) is a reasonable way to think about the relationships among the households in the village (not reasonable if there are barriers like rivers), you can run Optimized Hot Spot Analysis on the household points using the ratio as your Analysis Field and see if there is any clustering (that tool only requires 30 points when you provide an Analysis Field).

If, however, a network is a better representation of the relationships among households (this could also work if there are bridges from one side of the river to the other connecting households in a village), and you have data for the village transportation network, you can use the Generate Network Spatial Weights tool to create a network representation of the spatial relationships among the households. You would then use the Hot Spot Analysis tool (rather than Optimized Hot Spot Analysis) with the network spatial weights file (set the Conceptualization of Spatial Relationships parameter to Get Weights From File) to test for clustering.

I will also ask a colleague if he has other ideas or suggestions, and I will get back to you if he does (or ask him to please reply directly).

Very best wishes, Cheryl!
Lauren Scott
Esri
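The per-household ratio described above is simple to compute; here is a minimal pure-Python sketch (the household IDs and counts are hypothetical sample data, and in practice the values would come from your household attribute table):

```python
# Sketch: percent positive per household, as described above.
# Each record is (household_id, positive_count, negative_count) -- hypothetical data.
households = [
    ("HH01", 2, 3),
    ("HH02", 0, 5),
    ("HH03", 1, 1),
]

def positive_ratio(positives, negatives):
    """Positive cases divided by all surveyed occupants (positive + negative)."""
    total = positives + negatives
    return positives / total if total else None

ratios = {hid: positive_ratio(p, n) for hid, p, n in households}
```

The resulting ratio field is what you would feed to Optimized Hot Spot Analysis as the Analysis Field.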
02-27-2015
10:23 AM
POST
Hi Cheryl,

If my study area wasn't well defined, I probably wouldn't use Average Nearest Neighbor unless I just wanted to casually compare the average nearest neighbor distances (and wasn't interested in determining statistical significance). And that may be exactly what you want to do! I say this because, unfortunately, the Average Nearest Neighbor z-score and p-value calculation is very sensitive to study area size.

Average Nearest Neighbor is a global statistic that tells you about overall clustering. Getis-Ord Gi* is a local statistic that shows you where clusters are. It would not be unusual for a global statistic to say there is no clustering but for a local statistic to find statistically significant local clusters. So even if a global statistic tells you there is no clustering, that doesn't mean you shouldn't bother with a local statistic.

But I'm not sure Getis-Ord Gi* will actually be very helpful to you. You indicate the positive cases are very rare... if you map the positive cases, can you tell just by looking at the map whether there is clustering? I ask because for Getis-Ord Gi*, you really want a good range of values... if almost all of your households have zero cases and a couple have one or two cases, that really isn't enough variation in the analysis value to be appropriate for this tool. If there are at least 30 neighborhoods in the village, aggregating the counts and creating ratios (positive to negative) for each of those neighborhoods might provide enough variation.

Another issue that will be problematic, though, is appropriately modeling spatial interaction among the households. You will want a model that reflects that there is no (or less) interaction across the river than on the same side of the river.

Finally, keep in mind what this analysis is saying... the expectation (null hypothesis) is that every positive case could be dropped down hither and thither onto the households in a random manner. Consequently, finding statistically significant hot or cold spots suggests other processes (beyond random chance) may be at work... Still, all we can really do in this case is reject that null hypothesis. Would rejecting the null hypothesis be useful (would someone get excited to learn that who gets the disease is not completely random)? If we believe getting the disease is entirely the result of bad luck (no contagious property, no spreading vector, no genetic component), then yes, finding statistically significant clustering might be interesting. Similarly, if you have no idea what the factors promoting the disease are, then WHERE the clusters occur might show you where to start looking for answers.

But keep in mind that Getis-Ord Gi* works by comparing the local mean (the average positive cases for a household and its neighbors) to the global mean (the average cases for all households). The tool then determines whether the local mean is significantly different from the global mean. If you have a sea of households with zeros, any deviation from zero will be statistically significant, and just mapping the positive cases will probably show you where to look for factors that may be promoting the disease.

I guess this is what I would do:
1) If mapping the positive cases shows clear clustering, I would go with my map. Done.
2) If I had a good range of values for the household ratio of people with and without the disease, and if I had more than 30 households on each side of the river with positive cases, and if it was tricky to see whether the positive cases were clustered or not... I would run hot spot analysis on the ratios for each side of the river separately, using the exact same distance band for both analyses... Then I could make comments about the clustering overall and also compare the clustering on each side of the river.

I hope this helps a little bit!
Very best wishes,
Lauren Scott
Esri
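To make the local-mean-versus-global-mean comparison concrete, here is a minimal pure-Python sketch of the Gi* statistic. This is a simplified illustration only (binary neighbor weights, no FDR correction, no distance-band construction), not the ArcGIS implementation, and all names here are illustrative:

```python
import math

def gi_star(values, weights, i):
    """Getis-Ord Gi* z-score for feature i.

    values: analysis values (e.g. household ratios)
    weights: weights[i][j] = spatial weight between features i and j
             (binary here: 1 if j is a neighbor of i or j == i, else 0)
    """
    n = len(values)
    xbar = sum(values) / n                                   # global mean
    s = math.sqrt(sum(v * v for v in values) / n - xbar**2)  # global std dev
    w = weights[i]
    sw = sum(w)                                   # sum of weights for i
    sw2 = sum(wj * wj for wj in w)                # sum of squared weights
    local = sum(wj * xj for wj, xj in zip(w, values))
    num = local - xbar * sw                       # local sum vs expected
    den = s * math.sqrt((n * sw2 - sw * sw) / (n - 1))
    return num / den

# Toy demo: features 0-2 have high values and neighbor each other,
# features 3-5 have low values and neighbor each other.
values = [1, 1, 1, 0, 0, 0]
weights = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
z_hot = gi_star(values, weights, 0)   # positive z-score: local hot spot
z_cold = gi_star(values, weights, 5)  # negative z-score: local cold spot
```

Note how the statistic behaves in the "sea of zeros" situation described above: with almost no variation, s is tiny, so even a small local deviation produces a large z-score.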
02-09-2015
04:52 PM
POST
Hi Andrew,

You can definitely have some fun analyzing your crime data! You will have to convert the categorical data to counts or proportions, but you can still learn a lot about how the different types of crimes relate to each other. Here are some ideas:

1) Use Optimized Hot Spot Analysis. It will overlay your study area with a fishnet grid, count the number of crimes (of all types) that fall within each grid cell, then perform hot spot analysis to show you the hot and cold spot areas. This answers the question: where are the hot and cold spots of (all) crime? However, you can also run Optimized Hot Spot Analysis on different types of crime, then visually compare the hot spot maps.

2) You can use the fishnet grid cells from (1), or else use census tracts, and count the number of unique crime types within each polygon to see where you have the highest crime diversity (you might find that some places only have robberies, for example, while other places experience a wide variety of different crime types)... you can then run hot spot analysis on the diversity counts.

3) For fishnet grid cells (output from Optimized Hot Spot Analysis, then use Spatial Join) or census tracts, count the number of assaults, the number of robberies, the number of auto thefts, etc. within each polygon, then convert those counts to a percentage of all crime. You can then run grouping analysis to find polygons with similar challenges... one group might be high for assault and narcotics, for example, but low for robbery... knowing the profiles -- the specific challenges -- of each group can help you identify effective prevention strategies.

4) If you have distinct clusters of particular crimes, you can create standard deviational ellipses around each cluster, then overlay the ellipses for two different types of crimes to see how spatially integrated they are. (I'm doing an analysis right now that looks at violent crime in relation to alcohol establishments... to see how integrated those two "activity" spaces are.)

5) Using the output from (1), you can find spatial outliers for all crime or for specific crime types: a high crime count area surrounded by low crime count areas, or a low crime count area surrounded by high crime count areas. These anomalies are often very interesting (what is that one neighborhood doing right... it has no problem at all with narcotics while surrounding areas are high for drug-related crimes? ... or why is this one neighborhood so high in relation to surrounding neighborhoods?)

6) Be sure to check out the new space time pattern mining tools in the 10.3 release as well: An overview of the Space Time Pattern Mining toolbox—ArcGIS Help | ArcGIS for Professionals

I'm sure others will have ideas as well. I hope this is helpful!
Best wishes,
Lauren Scott
Esri
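Ideas (2) and (3) above boil down to simple per-polygon tabulations. Here is a pure-Python sketch (the cell IDs and crime types are made-up sample data; in ArcGIS this tabulation would come from a Spatial Join):

```python
from collections import defaultdict

# Hypothetical incident records: (cell_id, crime_type)
incidents = [
    ("A", "robbery"), ("A", "assault"), ("A", "narcotics"),
    ("B", "robbery"), ("B", "robbery"),
    ("C", "assault"), ("C", "auto theft"),
]

types_per_cell = defaultdict(set)
counts_per_cell = defaultdict(lambda: defaultdict(int))
for cell, ctype in incidents:
    types_per_cell[cell].add(ctype)
    counts_per_cell[cell][ctype] += 1

# Idea (2): crime diversity = number of unique crime types per polygon
diversity = {cell: len(t) for cell, t in types_per_cell.items()}

# Idea (3): each crime type as a percentage of all crime in the polygon,
# the input you would hand to grouping analysis
pct = {
    cell: {t: c / sum(cnts.values()) for t, c in cnts.items()}
    for cell, cnts in counts_per_cell.items()
}
```

The diversity counts feed idea (2)'s hot spot analysis; the percentage profiles feed idea (3)'s grouping analysis.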
02-09-2015
02:12 PM
POST
Hi Alex,

Sorry you are having problems with the tool! What version of the software are you using (ArcGIS 10.2 or ?)? You should not see that error unless you truly have all the same values, so that's odd. Is there any way you can send me your data so that I can try to reproduce the problem (you can remove all fields except the one that is giving problems)? If so, please email me directly at LScott@esri.com.

Again, sorry this isn't working for you!
Lauren
08-25-2014
12:58 PM
POST
Please see if this works for you: ArcGIS Help Space Time Cluster Analysis

Best wishes,
Lauren
08-19-2014
03:36 PM
POST
Our new tools for 10.3 are Create Space Time Cube and Emerging Hot Spot Analysis... they are beta in ArcGIS Pro and we are going to do our very, very best to get them into the 10.3 ArcMap release as well. For future releases we are planning to develop additional tools (probably Outlier Analysis next) to work on the cube (netCDF) data structure. Note: the additional tools work is not yet started, so it technically does not fall into the "concrete plans" category... but that's what we are thinking. Also definitely not yet in the "concrete plans" category: we are very interested in predictive event analysis and are starting to investigate.

Thanks for your interest in our work!
Lauren
08-14-2014
10:49 AM
POST
Great to know that the sample script is working on 10.2.2! It looks like your zip file only has the toolbox, though (instead of both the script and the toolbox). If Ori is using your zip file, that is definitely the reason he is getting an error about not finding the script.

If Ori is using the zip file I attached, then Phillip's suggestion to make sure the script path is correct is a good one! Here's how: once you navigate to the Temporal Collect Events toolbox, right-click on the tool and select "Properties"... then on the "Source" tab, check that the path to the temporalcollectevent.py script file is correct... if it isn't, browse to the .py file to set the correct path.

Ori: if you still have problems, please send the screenshot to my email, LScott@esri.com, and I will see if I can figure out what's what.

Thanks!
Lauren
08-14-2014
10:31 AM
POST
Just curious... what version of Desktop ArcGIS are you using? We will be slammed until the 10.3 release is done, but if we can, we'll try to get the temporal collect events sample script working for 10.2.2. Thanks! Lauren
08-13-2014
11:45 AM
POST
Well, I am still getting familiar with GeoNet too, but this is what I just tried (and it worked for me):
1) Click on the attachment.
2) A pop-up is displayed and one of the options is to save.
3) You should see a little blue down arrow when you say okay. If you click on it, you can get the zip file.
If that doesn't work for you, I will find someone who can tell us a better solution.

Here is some more information about the Emerging Hot Spot Analysis categories:

*************** Category Definitions ***************

Hot spots that are statistically significant for the last time step interval:
New: only the most recent time step interval is hot
Persistent: at least 90% of the time step intervals are hot, with no trend up or down
Intensifying: at least 90% of the time step intervals are hot, and becoming hotter over time
Diminishing: at least 90% of the time step intervals are hot, and becoming less hot over time
Consecutive: an uninterrupted run of hot time step intervals, comprising less than 90% of all intervals
Sporadic: some of the time step intervals are hot
Oscillating: some of the time step intervals are hot, some are cold

Hot spots that are not statistically significant for the last time step interval:
Historic: at least 90% of the time step intervals are hot, but the most recent time step interval is not

Cold spots that are statistically significant for the last time step interval:
New: only the most recent time step interval is cold
Persistent: at least 90% of the time step intervals are cold, with no trend up or down
Intensifying: at least 90% of the time step intervals are cold, and becoming colder over time
Diminishing: at least 90% of the time step intervals are cold, and becoming less cold over time
Consecutive: an uninterrupted run of cold time step intervals, comprising less than 90% of all intervals
Sporadic: some of the time step intervals are cold
Oscillating: some of the time step intervals are cold, some are hot

Cold spots that are not statistically significant for the last time step interval:
Historic: at least 90% of the time step intervals are cold, but the most recent time step interval is not

We won't be able to get it done for the first release, unfortunately, but in a future release you will be able to select the categories you are interested in and also modify how each category is defined (i.e., right now a persistent hot spot is one where 90% of the time step intervals are statistically significant hot spots and there are no statistically significant cold spots... you might want to change it to 80%, for example). Beta 5 will have better cell size and time interval defaults when you don't provide anything for those parameters, and messages defining the categories.

I hope this helps.
Best wishes,
Lauren
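The hot-spot half of those category definitions can be sketched as a small classifier. This is a simplified illustration only (the real tool classifies per-location Gi* z-scores from the space time cube and derives the trend with a Mann-Kendall test; the function name, flag encoding, and trend parameter here are all hypothetical):

```python
def classify_hot_spot(flags, trend=0):
    """Simplified sketch of the hot-spot categories described above.

    flags: one entry per time step interval; 1 = statistically significant
           hot spot in that interval, 0 = not significant.
    trend: +1 becoming hotter over time, -1 becoming less hot, 0 no trend.
    The cold-spot categories mirror these with cold-spot flags.
    """
    n = len(flags)
    total_hot = sum(flags)
    if flags[-1]:  # significant in the most recent interval
        if total_hot == 1:
            return "New"
        if total_hot / n >= 0.9:
            if trend > 0:
                return "Intensifying"
            if trend < 0:
                return "Diminishing"
            return "Persistent"
        # count the uninterrupted run ending at the final interval
        run = 0
        for f in reversed(flags):
            if not f:
                break
            run += 1
        if run == total_hot:   # all hot intervals form one final run, < 90%
            return "Consecutive"
        return "Sporadic"
    if total_hot / n >= 0.9:   # mostly hot, but not at the end
        return "Historic"
    return "No pattern"
```

For example, a location that is hot only in the final interval classifies as New, while one that is hot in 9 of 10 intervals but not the last classifies as Historic.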
08-13-2014
09:04 AM
POST
Hi Philip,

I'm attaching a zip file with the sample tool... it is pretty rough and has only been tested with 10.1 SP1. This question comes up often, so please allow me to answer it more broadly here: you want to run a space time hot spot analysis on your event data (crime, disease, traffic accidents) where you don't have an attribute/field to analyze. In other words, you just want to know where in space and time you have a statistically large number of events.

Some people have guessed that they need to use the Collect Events tool. Unfortunately, when Collect Events combines points that are near in space, you lose the temporal component of your data (it will combine two points that are near each other even if they have date stamps that are very far apart). What we need here is a tool that aggregates based on space AND time. You want to be able to set a distance threshold (500 meters, for example) and a time threshold (something like 5 days) and have the tool aggregate only those points that are within both thresholds. The attached sample script will do that.

Alternative approaches if the attached script doesn't work for you:
1) Create a model tool that iterates through time and selects only those features/events that meet your time requirement... run Collect Events on the selection set... use the Add Field tool to give the result a DATE field and calculate the value to be a date within the time period selected (for example, if you want to combine events within two days, you would calc the new date field to either the first or second day's date for each record output from Collect Events). Then merge all the results into a single file... then create the spatial weights matrix for the merged data using Generate Spatial Weights Matrix, then run hot spot analysis.
2) If you know Python, you can try to debug the attached script to make it work for whatever version of ArcGIS you are using.
3) You can sign up for the ArcGIS 10.3 beta program and use the space-time pattern mining tools in ArcGIS 10.3. The new Emerging Hot Spot Analysis tool allows you to run space-time hot spot analysis on event data.

I hope this helps!
Best wishes,
Lauren
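The core idea of the sample script (aggregate only points within BOTH a distance and a time threshold) can be sketched in a few lines of pure Python. This is a greedy illustration of the concept, not the attached script itself, and the function name and event tuples are hypothetical:

```python
import math

def spacetime_groups(events, dist_threshold, time_threshold):
    """Greedy sketch of space-AND-time aggregation.

    events: list of (x, y, t) tuples; distance is Euclidean in the x/y
    units, and t is in the same units as time_threshold (e.g. days).
    An event joins a group only if it is within BOTH thresholds of
    some event already in that group.
    """
    groups = []  # each group is a list of events
    for ev in events:
        x, y, t = ev
        placed = False
        for g in groups:
            if any(math.hypot(x - gx, y - gy) <= dist_threshold
                   and abs(t - gt) <= time_threshold
                   for gx, gy, gt in g):
                g.append(ev)
                placed = True
                break
        if not placed:
            groups.append([ev])
    return groups

# Demo: 500 m / 5 day thresholds. Two events 100 m apart on the same day
# combine; an event at the same location 10 days later does not.
groups = spacetime_groups(
    [(0, 0, 0), (100, 0, 0), (0, 0, 10), (100, 0, 1)],
    dist_threshold=500, time_threshold=5)
```

Note that a greedy, order-dependent grouping like this can chain events together through intermediaries; a production tool would need to pin down that behavior explicitly.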
08-12-2014
04:38 PM