Suggestions for an Algorithm for Assigning Values Based on Spatial Proximity

Question asked by rfairhur24 Champion on May 3, 2018
Latest reply on May 7, 2018 by rfairhur24

I am trying to come up with an algorithm that assigns the values of an attribute from a set of polygon features in one feature class to a set of polygon features in another feature class based on proximity.  I can conceive of several ways to approach this problem using Python and/or geoprocessing tools, but I need to optimize the final algorithm for performance.  I am hoping for Python code or geoprocessing suggestions that accomplish each step in the best order to achieve the greatest overall efficiency and speed.  The suggestions do not have to tackle the whole problem and can focus on any part, but I am laying out the complete problem so each step can be seen in light of the overall goal of the algorithm.


Assume I have two polygon feature classes named FC_A and FC_B, each of which has two attribute fields called Case_Field and Value_Field.  The following conditions must be met for the algorithm to reach a solution:


1. All conditions listed below should consider and compare only the set of features that share a common Case_Field value in both FC_A and FC_B.  If the comparison is not limited to a single Case_Field value, the number of overlapping features in both feature classes is massive.

2. A solution is only reached when every unique value in the Value_Field of FC_A has been assigned to at least one feature in FC_B and all of the features in FC_B have been assigned a value in the Value_Field from the set of unique values in FC_A.

3. The total number of unique values in Value_Field in FC_A must be equal to or less than the number of features in FC_B to be analyzed by this algorithm, otherwise the set of features associated with that Case_Field value will be ignored.

4. If a feature in FC_A does not overlap any of the features in FC_B the value in Value_Field of FC_A must be assigned to the closest feature in FC_B, provided that the FC_B feature has not previously been assigned a value from FC_A to satisfy this condition.

5. Any feature in FC_B that is overlapped partially or completely by only one feature in FC_A will be assigned the value from the FC_A feature that overlaps it, provided that the FC_B feature has not already been assigned a value to satisfy condition 4 above.

6. Any feature in FC_B that has portions overlapped by two or more features in FC_A will be assigned the value from the FC_A feature that overlaps its centroid or that is closest to its centroid if the centroid is not overlapped, provided that the FC_B feature has not already been assigned a value to satisfy condition 4 above.
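
One possible ordering of conditions 3 through 6 for a single Case_Field group is sketched below in plain Python.  This is only a minimal sketch under several assumptions: the spatial relationships (which FC_A values overlap each FC_B feature, polygon-to-polygon distances, and centroid-to-polygon distances) are assumed to have been computed beforehand by some geoprocessing step and passed in as dictionaries, and the function name `assign_values` and all parameter names are hypothetical.  It also assumes that an FC_B feature with no overlap at all takes the value closest to its centroid, which the conditions above do not state explicitly.

```python
def assign_values(a_values, b_ids, overlaps, centroid_dist, poly_dist):
    """Assign one FC_A value to each FC_B feature in one Case_Field group.

    All inputs are hypothetical, precomputed structures:
      a_values      -- unique Value_Field values in FC_A (dissolved by value)
      b_ids         -- identifiers of the FC_B features
      overlaps      -- {b_id: set of FC_A values overlapping that feature}
      centroid_dist -- {(b_id, a_value): distance from the FC_B centroid to
                        the FC_A polygon, 0 when the centroid is inside it}
      poly_dist     -- {(a_value, b_id): distance between the two polygons}
    Returns (assignment, unused_values), or None if condition 3 fails.
    """
    # Condition 3: ignore groups with more unique values than FC_B features.
    if len(a_values) > len(b_ids):
        return None

    assigned = {}

    # Condition 4: each value that overlaps nothing claims the nearest
    # FC_B feature that has not already been claimed this way.
    overlapping_values = set().union(*overlaps.values())
    for a in a_values:
        if a in overlapping_values:
            continue
        free = [b for b in b_ids if b not in assigned]
        nearest = min(free, key=lambda b: poly_dist[(a, b)])
        assigned[nearest] = a

    for b in b_ids:
        if b in assigned:
            continue
        candidates = overlaps.get(b, set())
        if len(candidates) == 1:
            # Condition 5: exactly one overlapping value.
            assigned[b] = next(iter(candidates))
        else:
            # Condition 6: the value at or nearest to the centroid.
            # Assumption: with no overlap at all, consider every value.
            pool = sorted(candidates) if candidates else sorted(a_values)
            assigned[b] = min(pool, key=lambda a: centroid_dist[(b, a)])

    # Condition 2 is not guaranteed by the steps above, so report any
    # value that ended up on no FC_B feature.
    unused = set(a_values) - set(assigned.values())
    return assigned, unused
```

Because the function only reads dictionaries, the expensive spatial work stays in whatever step builds them, and the assignment order itself can be tested without any geometry.  A non-empty `unused` set signals that condition 2 still needs a follow-up pass for that group.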


An illustration of the problem is shown below for one of the sets of features in FC_A and FC_B that all share a common Case_Field value.  The polygons with a grey fill and a colored outline are in FC_A and are labeled with the value stored in the Value_Field.  The polygons that are colored light purple are in FC_B and have Null in the Value_Field.  (Note: some of the values in the Value_Field of FC_A are actually associated with many polygons, but they should all be treated as a single dissolved polygon for the purposes of this algorithm.)

Input Polygons for a given Case Value in FC_A and FC_B

The solution should look something like the output below.  The FC_B fill and the FC_A outline have been assigned the same color for each value in the Value_Field

Output of FC_A and FC_B solution

Notice that values from the polygons in FC_A that do not overlap any of the polygons in FC_B have still been assigned to one of the FC_B polygons.  I did this manually, so I may not have chosen the closest polygon to satisfy condition 4, but I would expect the algorithm to choose the closest polygon.


If, even after fully optimizing the steps that evaluate spatial proximity, those steps would take 5 or more minutes to solve the sample problem above, I would probably settle for a solution that satisfies only conditions 1 through 3, using the fastest Python algorithm that can randomly distribute the values from FC_A into FC_B.  However, I prefer a solution that takes the spatial relationship into account.
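
For reference, the random fallback that satisfies only conditions 1 through 3 could look something like the sketch below.  It handles one Case_Field group at a time, and it assumes the FC_A values and FC_B feature identifiers for that group have already been read into Python lists; the function name `distribute_randomly` and its parameters are hypothetical.

```python
import random


def distribute_randomly(a_values, b_ids, rng=None):
    """Randomly spread FC_A values over FC_B features for one group.

    Returns {b_id: value}, or None when the group has no values or has
    more unique values than FC_B features (condition 3).
    """
    rng = rng or random.Random()
    # Collapse repeated values so each unique value is counted once.
    values = list(dict.fromkeys(a_values))
    if not values or len(values) > len(b_ids):
        return None

    targets = list(b_ids)
    rng.shuffle(targets)

    # First pass: give each unique value its own feature, so every
    # value is used at least once (condition 2).
    assignment = dict(zip(targets, values))

    # Second pass: the remaining features each get a random value.
    for b in targets[len(values):]:
        assignment[b] = rng.choice(values)
    return assignment
```

Shuffling the features once and pairing the first few with the unique values keeps the work linear in the number of FC_B features, which is the main reason to prefer it over repeatedly drawing random features until every value has been placed.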