Kernel Density

Discussion created by snb on Mar 6, 2013
Latest reply on Mar 7, 2013 by snb
The dataset we are working on originally contained around 200,000 disease sample locations (x,y point data), each recorded as positive or negative (a binary outcome). We then aggregated these data into raster format, essentially yielding a single prevalence value for each county that was sampled. We produced a kernel density map from these county values (very rough draft attached). The aim was a visually appealing, smoothed map that reveals areas of high prevalence, so that specific regions can be targeted more efficiently for disease surveillance. The resulting map is quite interesting, but I am concerned about the underlying distribution of the data. Sampling intensity varied widely over space and time: some counties have thousands of samples, others have only a few, and some went unsampled altogether.
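To make the aggregation step concrete, here is a minimal sketch of how per-county prevalence and sample size fall out of the point records. The county labels and records are made up for illustration and are not our data, and `county_prevalence` is a hypothetical helper, not part of any GIS package.

```python
from collections import defaultdict

def county_prevalence(samples):
    """Aggregate (county, is_positive) records into {county: (n_samples, prevalence)}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for county, is_positive in samples:
        totals[county] += 1
        if is_positive:
            positives[county] += 1
    # Prevalence is the share of positive samples in each sampled county.
    return {c: (totals[c], positives[c] / totals[c]) for c in totals}

# Hypothetical records: county "A" has four samples, county "B" only one.
samples = [("A", True), ("A", False), ("A", False), ("A", False), ("B", True)]
result = county_prevalence(samples)
print(result)
```

In this toy case the single-sample county ends up with a higher prevalence than the four-sample county, which is exactly the kind of effect I am worried about once the values are smoothed.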

My primary questions:
Is kernel density the appropriate tool and approach to use with this type of dataset? If so, should I attempt to weight by the sample size in each county as part of the analysis? If not, what analysis would be more suitable given the uneven distribution of the data?

Thanks in advance for any insight you may have.