Assumed Answered

Spatial Join returning incorrect join_count

Question asked by SVisovsky on Jul 10, 2015
Latest reply on Jul 10, 2015 by SVisovsky


I've been working on a geoprocessing model for a while now, trying to track down what I thought was a logic problem in it.  It turns out the problem wasn't the logic behind the model... it was that workhorse of geoprocessing tools, Spatial Join.  I'm using Spatial Join with a 15 foot search radius to find where points in one feature class have more than one point from another feature class within 15 feet.  The truly perverse thing is that this is the 2nd of 4 such tests in the model, and only this one has the problem.


The basic premise is this:  I have an existing point layer for each of Trees and Planting Spaces in a database.  We're conducting a largely crowd-sourced data collection effort to acquire all the street trees in the city and need to push those trees into the existing database.  So we get a feature class of the trees collected in the census and run some checks on it to see whether each one is a tree that's already in our system.  We anticipate the new tree census data to be much more spatially accurate but less accurate in terms of attribute information.  After running through my model, a given census tree should fall into one of the following buckets:

  1. Too many census trees within 15' of the same existing tree
  2. Too many existing trees within 15' of the same census tree
  3. 1:1 census:existing trees, but the diameters at breast height (DBH) are too different
  4. 1:1 census:existing trees, but the genera do not match
  5. 1:1 census:existing trees, existing tree snapped to census tree location
  6. no existing tree within 15', census tree appended to existing tree feature class
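
To make the bucket logic concrete, here is a minimal sketch in plain Python (no arcpy). It is not the actual model: the `Match` record, the `classify` function and the `DBH_TOLERANCE` value are all hypothetical names and numbers made up for illustration, and the order in which the checks run is assumed to follow the list above, which may not be the order the model uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    """Hypothetical summary of one census tree's 15-foot neighborhood."""
    census_per_existing: int            # census trees within 15' of the matched existing tree
    existing_per_census: int            # existing trees within 15' of this census tree
    dbh_census: Optional[float] = None
    dbh_existing: Optional[float] = None
    genus_census: Optional[str] = None
    genus_existing: Optional[str] = None


# Assumed tolerance; the real threshold is not stated in this post.
DBH_TOLERANCE = 4.0


def classify(m: Match) -> int:
    """Return the bucket number (1-6) from the list above."""
    if m.existing_per_census == 0:
        return 6  # nothing nearby: append the census tree
    if m.census_per_existing > 1:
        return 1  # too many census trees for one existing tree
    if m.existing_per_census > 1:
        return 2  # too many existing trees for one census tree
    if abs(m.dbh_census - m.dbh_existing) > DBH_TOLERANCE:
        return 3  # 1:1, but DBH too different
    if m.genus_census.lower() != m.genus_existing.lower():
        return 4  # 1:1, but genus mismatch
    return 5      # 1:1 and consistent: snap existing tree to census location
```

Note that buckets 1 and 2 depend entirely on the two neighbor counts, so a wrong count sends a tree straight into the wrong bucket.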


All this works fine... except for the spatial join that determines whether or not there are too many census trees for an existing tree.  The procedure for this (and for determining if there are too many existing trees for a single census tree) is as follows.  Perform a Spatial Join with existing trees as the target FC and census trees as the join FC.  This is a one-to-one join with a search radius of 15 feet.  Then join the output back to the census FC based on the unique tree ID.  Use the Select Layer By Attribute tool to select all records with a join_count (a field automatically created by a one-to-one spatial join) greater than 1.  Then use Calculate Field to update a status field to indicate that the selected census trees are shared by too few existing trees, i.e. more than one census tree matched the same existing tree.  Remove the join and move on to the next part of the model.  See Fig 1 for the model builder view of this process.
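
One way to check the join_count values independently of Spatial Join is a brute-force distance count. The sketch below is plain Python, not arcpy, and rests on some assumptions: coordinates are planar and already in feet, the distance test is inclusive at exactly 15 feet (the tool may treat the boundary differently), and the function name, tree IDs and coordinates are invented for illustration.

```python
import math


def count_within(targets, joins, radius):
    """Return {target_id: number of join points within `radius` of the target}.

    `targets` and `joins` map an ID to an (x, y) tuple in the same planar units.
    """
    counts = {}
    for tid, (tx, ty) in targets.items():
        counts[tid] = sum(
            1
            for (jx, jy) in joins.values()
            if math.hypot(jx - tx, jy - ty) <= radius
        )
    return counts


# Hypothetical sample data: two existing trees, three census trees.
existing = {"E1": (0.0, 0.0), "E2": (100.0, 100.0)}
census = {"C1": (5.0, 0.0), "C2": (0.0, 12.0), "C3": (100.0, 120.0)}

expected = count_within(existing, census, 15.0)

# Existing trees with more than one census tree nearby (bucket 1 candidates).
over = sorted(tid for tid, n in expected.items() if n > 1)
```

With coordinates exported from the two feature classes, the resulting counts can be compared against the join_count field for the same existing tree IDs. The loop is O(n*m), which should be acceptable for a one-off check but not for very large layers.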


Fig 1. (apologies for the size)











The problem I have discovered is that the Spatial Join at the beginning of the process is returning incorrect values for the join_count field.  As an example, see Fig 2, which shows two census trees (blue, "Fraxinus") that should both match the existing tree (red, "Fraxinus pennsylvanica - green ash") because they are within 15' of it.  Several otherwise-successful attempts at running the model have produced a join_count of 1.  I opened the layers in an mxd for checking and ran the spatial join through that mxd 5 additional times.  The first time, for these points, it produced a join_count of 1.  The second time, with no changes other than the output file name, it produced a join_count of 0.  The three subsequent times, again with no changes other than the output file name, it produced the correct join_count of 2.  I ran the model again after resetting the data... and got a join_count of 1 again for these points.
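
To find every tree affected by this run-to-run variation, and not just the ones spotted by eye, the join_count values from each output could be compared programmatically. This is a rough sketch in plain Python, assuming each run's output has been read into a dictionary of tree ID to join_count; the function name, run names and IDs are hypothetical, and the sample counts simply mirror the 1, 0, 2 sequence described above.

```python
def diff_join_counts(runs):
    """Return {tree_id: {run_name: join_count}} for IDs whose count differs between runs.

    `runs` maps a run name to a {tree_id: join_count} dictionary.
    An ID missing from a run is reported with a count of None.
    """
    all_ids = set().union(*(counts.keys() for counts in runs.values()))
    unstable = {}
    for tid in sorted(all_ids):
        values = {name: counts.get(tid) for name, counts in runs.items()}
        if len(set(values.values())) > 1:
            unstable[tid] = values
    return unstable


# Hypothetical sample: one existing tree whose count changes, one that is stable.
runs = {
    "run1": {"E1": 1, "E2": 0},
    "run2": {"E1": 0, "E2": 0},
    "run3": {"E1": 2, "E2": 0},
}

unstable = diff_join_counts(runs)
```

If the same inputs and parameters produce a non-empty result here, that points at the tool or the environment, not at the model logic.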


The little dots inside the blue circles indicate that the tree in question was added to the existing database.  The two trees in the corners of Fig 2. were added (rightly so) to the existing database.  The two indicated by measurements were added to the existing database in error because of this spatial join problem.


Fig 2.




Now, I'm open to the possibility that I did something wrong somewhere along the line.  But I'm pretty sure I've accounted for everything.  If Spatial Join is misbehaving in such a fundamental way, it is truly disconcerting, as it is one of the most commonly used tools.  Let me know if there's anything I need to clarify.