I'm a GIS undergrad student involved with a project where I'm tasked with interpolating water quality data along a winding river (around 500km long, but with a focus on a ~100km reach). I'm working mostly with phosphorus concentration values ranging from 0.0015 to 9.413. The data comes from water quality monitoring stations (point features) that I have snapped to within the river boundary polygon (which I have been using as a break line barrier).
The monitoring station data exists as a point feature class with one point for each measured water quality value. Some of these stations have values collected daily, others monthly, and many have just a single value collected as part of a single project. This means that in some instances there are dozens (or hundreds) of points lying at the same coordinates. The resulting raster should be interpolated from all points collected within a certain time period (mostly within a specific month). The goal is to create a model that performs the interpolation based on a model parameter where the user selects a time frame.
Right now I'm using a model that selects points based on a collection date within a certain month, then interpolates using these points as input and the river polygon as a barrier. Usually the selected month contains data from 4-7 stations. Since some stations have points collected daily, there are often dozens of overlapping features at one location, while other locations contain only one point. I do not know how this affects the interpolation. Ideally the average of all the overlapping values would be used as the value to interpolate with, but what happens if I perform the interpolation including all of these overlapping points? Is there a way to automate using the mean of all these values? Or to keep only the point with the median value selected for the interpolation?
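To make the aggregation I have in mind concrete, here is a minimal sketch in plain Python (not ArcGIS-specific). It assumes the station records have already been exported as `(x, y, sampled_on, value)` tuples. That layout, the function name `collapse_overlapping`, and the sample values are all hypothetical and only for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, median


def collapse_overlapping(samples, year, month, reducer=mean):
    """Return one (x, y, value) tuple per location for the given month.

    samples: iterable of (x, y, sampled_on, value) tuples (assumed layout).
    reducer: function applied to all values sharing the same coordinates,
             e.g. statistics.mean or statistics.median.
    """
    groups = defaultdict(list)
    for x, y, sampled_on, value in samples:
        # Keep only measurements taken in the requested month.
        if sampled_on.year == year and sampled_on.month == month:
            groups[(x, y)].append(value)
    # One output point per distinct coordinate pair, sorted for stable output.
    return [(x, y, reducer(values)) for (x, y), values in sorted(groups.items())]


# Made-up example: one station sampled daily, one sampled once per month.
samples = [
    (100.0, 200.0, date(2012, 6, 1), 0.10),
    (100.0, 200.0, date(2012, 6, 2), 0.30),
    (100.0, 200.0, date(2012, 6, 3), 0.20),
    (500.0, 900.0, date(2012, 6, 15), 1.50),
    (500.0, 900.0, date(2012, 7, 1), 9.00),
]

monthly_mean = collapse_overlapping(samples, 2012, 6)
monthly_median = collapse_overlapping(samples, 2012, 6, reducer=median)
```

The collapsed list would contain one point per station, which is the input I would like the interpolation tool to receive. Writing it back to a point feature class is not shown here.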
So far I've encountered a number of problems due to the winding nature of the river and the large gaps between monitoring stations (sometimes several kilometers). All of the resulting rasters either contain no interpolated values (only 'no data' in the entire raster) or contain interpolated values only for small sections of river (in the areas closely surrounding the point features).
I have tried a number of test interpolations, mostly using 'Diffusion Interpolation With Barriers' and 'Kernel Interpolation With Barriers' in the Geostatistical Analyst extension. I have also run some tests with 'Spline with Barriers' in the Spatial Analyst extension, but with poor results.
My main questions (in addition to those above) are as follows:
1) Has anyone done something similar to this? Research on river interpolation has turned up sparse results. Is there a better way to do this than using the interpolation methods mentioned above?
2) What tool parameters should I change to increase the distance over which values are interpolated? Is there something else that I need to do to ensure that the entire river polygon in the study area contains interpolated values?
3) Is there a way to take into account the flow of the river (as values upstream from a point should have a greater weight than values downstream)? This is not as important, as it falls outside the current scope of the project, but it would be a nice feature to include.
Any help is much appreciated.