Hello, I am doing a project looking at the prevalence of weeds in an area. As an example, I have the following layers:
*Weeds Layer (polygons) {categorical data} Each polygon represents a point or polygon that corresponds to a certain type of weed. There are 16 weed types.
*Plant Zone Layer (polygons) {categorical} Each polygon corresponds to 1 of 7 categories.
*Rainfall (polygons) {ordinal numerical categories} Each polygon corresponds to one of 5 ranges, such as 0-5 inches, 5-10 inches, etc.
*Average vegetation height (polygons) {Numerical} Polygons representing the average height of vegetation
-- These layers are all projected and fit nicely on top of each other in my workspace. I want to answer questions like, "Where does weed X occur most often?" (In dry areas? In plant zone 4? Where the vegetation is high?)
So, this sounds like something suited to correlation and regression in something like R, but the layers are only spatially related. I'm new to GIS and any help is greatly appreciated. I'll clarify my goals if it helps.
This problem is not well suited to correlation/regression analysis. In statistics, what you are describing is called contingency analysis (http://en.wikipedia.org/wiki/Contingency_table). From the description of your data, you can generate the table for a contingency analysis by running a series of spatial joins (ArcToolbox > Analysis Tools > Overlay > Spatial Join) and then calculating the frequency of each weed type within each category of the other layers. The standard statistical test on a contingency table is the chi-square test.
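To make the frequency step concrete, here is a minimal sketch in plain Python. It assumes the spatial join has already been exported as one (weed type, plant zone) pair per weed polygon. The weed names, zone labels, and counts are invented for illustration, and `contingency_table` and `chi_square` are hypothetical helper names, not part of any GIS toolbox. It computes only the chi-square statistic and degrees of freedom. For a p-value you would normally pass the table to a statistics package (for example `scipy.stats.chi2_contingency`, or `chisq.test` in R).

```python
from collections import Counter

# Hypothetical output of a spatial join: one (weed_type, plant_zone) pair
# per weed polygon. These values are invented for illustration only.
joined = [
    ("thistle", "zone1"), ("thistle", "zone1"), ("thistle", "zone1"),
    ("thistle", "zone2"),
    ("gorse", "zone1"),
    ("gorse", "zone2"), ("gorse", "zone2"), ("gorse", "zone2"),
]

def contingency_table(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table

def chi_square(table):
    """Return Pearson's chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if weed type and zone were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

rows, cols, table = contingency_table(joined)
stat, dof = chi_square(table)
print(rows, cols, table)
print(f"chi-square = {stat:.3f}, dof = {dof}")
```

The same two functions would apply to the rainfall layer by swapping the second element of each pair. The toy counts here are far too small for the chi-square approximation to be trusted. They only show the mechanics.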