Hello friends. Excuse me if this is a relatively basic question!
I am doing a hot spot analysis of monarch butterfly populations across North America. Essentially, I want to see whether milkweed and butterfly observations overlap (which they should, haha). I was taught to aggregate the data by creating a fishnet, but when I did this in class, the study area was relatively small.
Is it even necessary to aggregate the data when the study area literally spans the continent? And where would I even start in deciding how big the grid cells should be? Any help is appreciated!
You're the only one who knows your requirements for the size of "hot spots". Too coarse and you get blocky, useless maps; too fine and the processing never finishes. The only way to find out how your environment will perform is to try it.
While Create Fishnet will populate a grid in a specific projection, writing your own polygon generation script has benefits. I've created a global grid of 25 sq km polygons in decimal degrees, using a coarse, generalized landform layer to restrict polygons to within 60 nm of land. Each cell used a fixed width, with the height adjusted until the polygon's area was within 0.13% of the true target, based on a geodesic calculation.

I also embedded metadata in each polygon to form tessellations: each 25 sq km cell had a 100 sq km parent, each 100 sq km cell had a 400 sq km parent, and so on, up seven levels. This allowed modelling at multiple scales, so model validation could be done quickly at 1600 sq km, then the overnight job could run at 25 sq km.

Since you're working with a smaller area, you could use a finer grid (6.25 sq km and 1.5625 sq km were possible, though at significant processing cost, so you might only want to model limited areas [known butterfly "farms"] at that resolution).
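For illustration, here's a rough sketch of the height-adjustment idea in Python, assuming pyproj is available. The function name, the bisection approach, and the default values are just one way to do it, not the exact code I used:

```python
from pyproj import Geod

GEOD = Geod(ellps="WGS84")

def cell_height_deg(lat_south, lon_west, width_deg,
                    target_km2=25.0, tol=0.0013):
    """Bisect for the cell height (in degrees) whose geodesic area
    is within tol (fractional error) of target_km2."""
    lo, hi = 0.0, 1.0  # search bounds in degrees; ample for a 25 sq km cell
    h = (lo + hi) / 2.0
    for _ in range(60):  # bisection converges long before 60 steps
        h = (lo + hi) / 2.0
        # Corners of the candidate cell, counterclockwise from the SW corner
        lons = [lon_west, lon_west + width_deg,
                lon_west + width_deg, lon_west]
        lats = [lat_south, lat_south, lat_south + h, lat_south + h]
        area_m2, _ = GEOD.polygon_area_perimeter(lons, lats)
        area_km2 = abs(area_m2) / 1e6
        if abs(area_km2 - target_km2) / target_km2 < tol:
            break
        if area_km2 < target_km2:
            lo = h  # cell too small: grow it
        else:
            hi = h  # cell too big: shrink it
    return h
```

The row/column indexing can then carry the parent IDs for the tessellation: integer-divide a cell's row and column by 2 to get its 100 sq km parent, by 4 for its 400 sq km parent, and so on up the hierarchy.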
Your first task is to try generating a coarse grid, then work out the math to generate a finer one with the same alignment. Then you can select out the cells that lie entirely over open ocean and see what you can do with the rest. From there, it's a matter of data quality and processing time.
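As a hypothetical sketch of the alignment step (the tuple layout is mine for illustration; your real cells would still get the geodesic height treatment above), quartering a coarse cell keeps the finer grid anchored to the same origin:

```python
def quarter(cell):
    """Split (lon_west, lat_south, width_deg, height_deg) into the
    four aligned child cells that tile the same footprint."""
    x, y, w, h = cell
    hw, hh = w / 2.0, h / 2.0
    return [(x,      y,      hw, hh),  # SW child
            (x + hw, y,      hw, hh),  # SE child
            (x,      y + hh, hw, hh),  # NW child
            (x + hw, y + hh, hw, hh)]  # NE child
```

For the open-ocean cull, a spatial join against a generalized land layer (Natural Earth, for example) lets you drop cells that don't intersect land before you spend any processing time on them.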
- V