Getis-Ord G* - Hot Spot Analysis Question - Fisheries

Discussion created by thomas.nies_noaa on Jan 10, 2013
I am working with a team that is analyzing scientific research trawl survey data for clusters of large fish. The survey uses a stratified random sampling design. Survey strata are defined primarily by depth contours, and the strata do not all have the same area. Multiple tows are conducted in each stratum (usually, but not always, the same number of tows per stratum). I have attached a figure (Survey_Strata_Stations.jpeg) showing the strata and tow locations from several years and seasons pooled together; you can see that tow density is generally higher in shallow water and lower in deeper water. For each tow we can determine the catch of fish at size, as either numbers or weight. The data are not normally distributed, but a log transformation brings them closer to normal. The underlying hypothesis is that large fish may be spatially correlated during spawning (which roughly coincides with the time of the survey), and that this may help us identify spawning areas.

We are considering using the hot spot cluster analysis (G*), run on the individual tows across the whole survey area (i.e., not separately within each stratum). Because the strata are of different sizes, the density of tows (i.e., the number of tows per unit area) varies over the survey range. Does this matter? We are concerned that the non-uniform distribution of the survey tows may bias the clustering results. Put glibly - will we only find clusters where there are clusters of samples? We have discussed addressing this through some kind of spatial weighting scheme, but (1) we aren't sure that is necessary and (2) we don't have a clear idea of what sort of scheme might work.
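To make the question concrete, here is a minimal sketch of the statistic as I understand it, written in the usual Ord and Getis z-score form with a binary fixed-distance band. This is not the ArcGIS implementation: the function name `gi_star`, the input layout, and the choice of a fixed band are my own assumptions for illustration only.

```python
import numpy as np


def gi_star(coords, values, band):
    """Getis-Ord Gi* z-scores with a binary fixed-distance band (sketch).

    coords: (n, 2) projected tow positions (hypothetical input layout)
    values: (n,) catch per tow, e.g. log-transformed
    band:   distance threshold, same units as coords (assumed scheme)
    Returns (z, n_neighbors); z is NaN where the statistic is undefined.
    """
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(values, dtype=float)
    n = x.size

    # Binary weights: 1 if tow j lies within `band` of tow i, else 0.
    # The diagonal is 1 (distance 0), so each tow counts as its own
    # neighbor; that is the difference between Gi* and Gi.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = (dist <= band).astype(float)

    xbar = x.mean()
    s = np.sqrt(max((x ** 2).mean() - xbar ** 2, 0.0))  # population std dev

    w_sum = w.sum(axis=1)            # tows inside the band, including tow i
    w_sq_sum = (w ** 2).sum(axis=1)

    numer = w @ x - xbar * w_sum
    denom = s * np.sqrt(np.maximum(n * w_sq_sum - w_sum ** 2, 0.0) / (n - 1))

    # Leave NaN where the denominator is zero (e.g. every tow is a neighbor,
    # or all values are identical).
    z = np.full(n, np.nan)
    np.divide(numer, denom, out=z, where=denom > 0)
    return z, w_sum.astype(int)
```

The second return value is the number of tows that fall inside the band around each tow. With a fixed band, that count is where uneven tow density enters the calculation: in our survey it would be larger in the shallow strata and smaller in the deep ones, which is the source of our concern.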

This issue does not appear to be addressed in the ArcView tutorial example, even though the samples in that example are not uniformly distributed either.

Thanks for any advice you can provide.

T Niessen
Fishery Analyst