I have a large dataset where I have summed point values in polygons. I have a separate value per year, over five years. I want to find the trend in each polygon over time. I'd like to set up a regression to find the coefficient for change (positive or negative trend) in each polygon, but can't figure out how to set this up. Any suggestions? Thanks!
You could do this painfully with a combination of the field calculator, summary statistics, and joins; however, I think you'd be better off exporting your table to Excel or a dedicated statistical program, performing your regression there, and then joining the results back to your polygons.
Yeah, that's what I was thinking I might have to do. I know I have seen people run regressions in ArcGIS, though, and was hoping there might be a way to set this up.
I'm not sure if that's necessary, but I do know that would work.
This tool promises to do simple regression, but I've never used it:
Regression is the last thing you should be contemplating unless the data meet the underlying assumptions required for the test (see the link in Darren's post and its requirements). Have you completed your descriptive statistics and graphing? Is there any reason that the sum of the points in the polygon should vary over time? Do the individual observations exhibit the same pattern? In fact, a brief summary of your data might bring forward other suggestions.
Hi Dan,
In response, yes, there is great variation in the points over time, as I am measuring the frequency of fishing in certain ocean areas. What I have is the average number of times a fishing vessel entered a certain area, which was averaged over a 5-year period. I am trying to find out the magnitude and direction of the trend. Perhaps there is a better way to go about this?
Hi Alison. Dan is right. You would be violating a lot of assumptions with a regression model.
If I understand correctly, you have five time values (1-5) as your independent variable and, for each polygon, a corresponding value as the dependent variable. I think you can do a brute-force calculation that estimates the slope of the best-fit line through those five data points.
This is what OLS does when calculating the slope in a linear regression model. You will have to do this for each polygon, and there is no intrinsic measure for goodness of fit. At a minimum, look for a linear relationship over time.
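The brute-force slope calculation described above is just the standard OLS slope formula applied to equally spaced time steps. A minimal Python sketch (the function name `trend_slope` is mine, not from the thread) might look like:

```python
def trend_slope(values):
    """OLS slope of a best-fit line through equally spaced yearly values.

    values: the summed/averaged value for years 1..n (here n = 5).
    Returns the estimated change per year (positive = increasing trend).
    """
    n = len(values)
    xs = range(1, n + 1)                     # years coded 1, 2, ..., n
    x_mean = sum(xs) / float(n)
    y_mean = sum(values) / float(n)
    # slope = sum((x - x̄)(y - ȳ)) / sum((x - x̄)²)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

trend_slope([10, 12, 15, 14, 18])   # positive slope (1.8): increasing trend
```

Applied to each polygon's five yearly values, the sign of the result gives the direction of the trend and its magnitude gives the average change per year, with no goodness-of-fit measure, as Carl notes.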
Ordinary Least Squares is very powerful, is well documented in ArcGIS, and includes great diagnostics.
Thank you Carl, that is helpful. What I have is the average number of times a fishing vessel entered a certain area, averaged over a 5-year period. What I wanted to find was the coefficient, to see whether the trend is increasing or decreasing over time, e.g. is the polygon becoming more or less valuable to fishing vessels over time. I have over 4,000 polygons, so I cannot do this manually and wanted to set up an automated process. I tried the OLS tool in ArcGIS but am not sure I understand the outputs. I am not trying to conduct a multiple regression; I just want to know the direction and magnitude of the trend. Should the OLS tool do this for me?
Alison, you can run the expression I provided in the field calculator. I am also going to suggest that you look at grouping analysis as a way of characterizing how the polygons are changing over time: Grouping Analysis—Help | ArcGIS for Desktop
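The exact expression Carl provided is not preserved in the thread, but a field-calculator version of the slope calculation might look something like this (Python parser, with the helper function in the pre-logic code block; the field names `Year1` through `Year5` and the output field are placeholders for whatever the actual table uses):

```python
# Pre-logic code block:
def slope(y1, y2, y3, y4, y5):
    """OLS slope for five yearly values, with years coded 1..5."""
    ys = [y1, y2, y3, y4, y5]
    x_mean = 3.0                  # mean of 1..5
    y_mean = sum(ys) / 5.0
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip([1, 2, 3, 4, 5], ys))
    return num / 10.0             # sum((x - 3)^2) for x = 1..5 is 10

# Expression (calculates the new trend field for every row/polygon):
# slope(!Year1!, !Year2!, !Year3!, !Year4!, !Year5!)
```

Because the field calculator evaluates the expression once per row, this computes the per-polygon trend for all 4,000+ polygons in one pass.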
Hi Carl, thank you, I was able to use the expression - appreciate the help!