Spatially joining polygons centroid within

12-27-2017 02:45 AM
DavidMatthews2
New Contributor II

I have two polygon feature classes that I want to spatially join, summing certain fields. The target feature class has larger polygons (regions) that cover the second feature class of smaller polygons (property parcels). I would like to use the centroids of the property parcels for the join, as the parcels are not always fully contained in the regions, but 'have their center in' only joins the opposite way, with the target features being the smaller polygons.

Is there any way to do this without creating a point feature class from my parcels and using 'contains' as my join?

 

Thanks

12 Replies
DanPatterson_Retired
MVP Emeritus

Switch the order of the join, then perhaps do an attribute join in the opposite direction.
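A minimal sketch of that idea in plain Python, with no arcpy calls. It assumes the reversed spatial join (parcels as target) has already produced one row per parcel carrying the ID of the region its center falls in. The field names (`region_id`, `value`, `value_sum`) and the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical output of the reversed spatial join (parcels as target):
# each parcel row carries the ID of the region that contains its center.
joined_parcels = [
    {"parcel_id": 1, "region_id": "A", "value": 10.0},
    {"parcel_id": 2, "region_id": "A", "value": 5.0},
    {"parcel_id": 3, "region_id": "B", "value": 7.5},
]


def sum_by_region(rows, key_field, sum_field):
    """Total sum_field for every distinct key_field value."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_field]] += row[sum_field]
    return dict(totals)


region_totals = sum_by_region(joined_parcels, "region_id", "value")

# "Attribute join" the totals back onto the regions by their shared ID.
# Regions without any parcel get 0.0.
regions = [{"region_id": "A"}, {"region_id": "B"}, {"region_id": "C"}]
for region in regions:
    region["value_sum"] = region_totals.get(region["region_id"], 0.0)
```

Because each parcel has exactly one center, a parcel contributes to at most one region's total when the regions do not overlap.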

XanderBakker
Esri Esteemed Contributor

You posted this in the Python space. Are you looking for a script to do this? If the simple way Dan mentioned (doing a manual spatial join) does not work for you, there is always a scripting solution. However, that will take some time to complete. How much coding experience do you have and what have you tried so far?

I assume that you want to obtain statistical information from multiple parcels for each region and add that data to the regions, right?

How large are your datasets? How many features do you have in each feature class?

As a general outline of what could work I would be thinking along the lines of:

  • Create a dictionary with the centroids of the parcels and the corresponding numeric data on which to apply the statistics (oid : [centroid, numeric fields])
  • Create a dictionary of the regions (oid : polygon)
  • Loop through the regions and, for each region, loop through the parcels, check whether the centroid is inside the region, and populate a dictionary with a list of the parcel numeric data per region
  • Perform the statistics on the list of parcels per region
  • Add the resulting statistics to the regions

If the datasets are large, it would be better to use select by location for each region to improve performance. The steps above only apply when the regions do not overlap and a parcel centroid falls inside a single region; otherwise a parcel will be counted multiple times.
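The outline above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the sample data is invented, regions are simple single-part rings without holes, and the containment check is a hand-written ray-casting test rather than an arcpy geometry method. In a real script the centroids and rings would presumably be read with cursors (for example via the `SHAPE@XY` token), which is not shown here.

```python
from collections import defaultdict


def point_in_ring(x, y, ring):
    """Ray-casting test for a simple ring given as a list of (x, y) vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does this edge straddle the horizontal line through the point?
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


# Hypothetical data: parcel oid -> (centroid, [numeric fields])
parcels = {
    1: ((1.0, 1.0), [100.0]),
    2: ((2.0, 3.0), [50.0]),
    3: ((7.0, 1.0), [25.0]),
    4: ((20.0, 20.0), [1.0]),  # falls in no region
}
# Hypothetical data: region oid -> ring of vertices
regions = {
    10: [(0, 0), (5, 0), (5, 5), (0, 5)],
    20: [(5, 0), (10, 0), (10, 5), (5, 5)],
}

# Collect the first numeric field of every parcel whose centroid is inside.
values_per_region = defaultdict(list)
for region_oid, ring in regions.items():
    for parcel_oid, (centroid, values) in parcels.items():
        if point_in_ring(centroid[0], centroid[1], ring):
            values_per_region[region_oid].append(values[0])

# The statistic here is a plain sum; other statistics work the same way.
stats = {oid: sum(values_per_region[oid]) for oid in regions}
```

The nested loop tests every parcel against every region, so the cost grows with the product of the two feature counts.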

DavidMatthews2
New Contributor II

Thanks Dan Patterson and Xander Bakker for the suggestions. My datasets are large, with 2M parcels and 200k regions, so I think the dictionary option will be very slow.

It seems strange that there isn't a join option that is the opposite of "HAVE_THEIR_CENTER_IN: The features in the join features will be matched if a target feature's center falls within them",

that is, an option along the lines of "The features in the join features will be matched if the join feature's center falls within the target feature."

DanPatterson_Retired
MVP Emeritus

given the size... I would consider tiling the data first, this is a classic case where tiling would be easy since it should be fairly obvious which polygon belongs to what... even with overlap if needed.  The mere size of the files will bog processing down, so the workflows suggested should account for most situations and you can deal with the remaining separately.

XanderBakker
Esri Esteemed Contributor

As you already mentioned, having 2M elements in a dictionary will be very slow, since a dictionary takes about 3x the size as overhead. 

In addition to what Dan Patterson already mentioned, if there is any attribute that can be identified in both datasets (like a neighborhood or something a little bigger), you could base the "tiles" on those areas. If there is no common attribute, then tile the area as Dan mentioned (perhaps with a little overlap just in case) and process the tiles. Doing 200k select by locations on a dataset of 2M features will take a lot of time and is not the way to go.
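One way to picture the tiling idea is a regular grid that buckets parcel centroids by cell, so each region is only compared against centroids in the cells its bounding box touches. The sketch below is plain Python under assumptions of my own: the cell size, the helper names `build_grid` and `candidates_in_bbox`, and the sample coordinates are all hypothetical, and an exact inside test is still needed on the candidates.

```python
import math
from collections import defaultdict


def build_grid(points, cell_size):
    """Bucket point IDs by the grid cell their coordinates fall in."""
    grid = defaultdict(list)
    for oid, (x, y) in points.items():
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        grid[cell].append(oid)
    return grid


def candidates_in_bbox(grid, bbox, cell_size):
    """Return IDs of points in all cells touched by bbox (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = bbox
    found = []
    for cx in range(math.floor(xmin / cell_size), math.floor(xmax / cell_size) + 1):
        for cy in range(math.floor(ymin / cell_size), math.floor(ymax / cell_size) + 1):
            found.extend(grid.get((cx, cy), []))
    return found


# Hypothetical parcel centroids: oid -> (x, y)
centroids = {1: (1.0, 1.0), 2: (2.0, 3.0), 3: (7.0, 1.0), 4: (20.0, 20.0)}
grid = build_grid(centroids, cell_size=5.0)

# Candidates for a region whose bounding box is (0, 0) to (5, 5).
# Parcel 3 is returned too because its cell touches the box edge,
# so a precise point-in-polygon check still has to follow.
near_region = candidates_in_bbox(grid, (0.0, 0.0, 5.0, 5.0), cell_size=5.0)
```

The far-away parcel 4 is never examined for this region, which is where the saving comes from on large datasets.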

Is it possible to post a part of the data (say 1 region and the corresponding parcels for that region)?

RobertStevens
Occasional Contributor III

Amen to what David said. Why is there no "have their center in"? Is there a workaround? Sure. But I have a better workaround: have Esri actually make logical design decisions and better software.

MarianneRohrbach
New Contributor III

You may want to have a look at the Geoprocessing Tool arcpy.TabulateIntersection_analysis, probably followed by Sort and SummaryStatistics to select best match (first row).
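The "sort and keep the first row" step can be illustrated in plain Python. This sketch assumes the intersection table has already been read into tuples of (parcel ID, region ID, percentage of the parcel inside that region); that layout and the sample values are hypothetical. Note that this picks the region with the largest overlap, which is a different criterion from the centroid test and may give a different answer for irregular parcels.

```python
# Hypothetical rows: (parcel_id, region_id, percent of parcel area in region)
rows = [
    (1, "A", 80.0),
    (1, "B", 20.0),
    (2, "A", 45.0),
    (2, "B", 55.0),
    (3, "B", 100.0),
]


def best_match(rows):
    """Keep, for each parcel, the region with the highest percentage."""
    best = {}
    # Sort by parcel, then by descending percentage, and keep the first row.
    for parcel_id, region_id, _pct in sorted(rows, key=lambda r: (r[0], -r[2])):
        best.setdefault(parcel_id, region_id)
    return best


parcel_to_region = best_match(rows)
```

The resulting parcel-to-region mapping can then be summarized per region in the usual way.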

Jan_PeterGlock1
New Contributor II

[Edit: "A's centroids are always within B (but B's centroids are not always in A)" is wrong and should state "B's centroids are always within A (but A's centroids are not always in B)"]

This is not a question:

I have the same problem as David; however, the "do it the opposite way and do an attribute join" solution is not an option for me. This is because my targets (A) are only partially in my joins (B) and the other way round. They just intersect. However, they intersect with neighbouring features as well. The only useful spatial relation to join them is the centroid. A's centroids are always within B (but B's centroids are not always in A). Since I am far from being a coder, I will have to convert the join features into points as David suggests.

In that sense, I would like to suggest that Esri add the needed match option in an upcoming update. It seems fundamental anyway.

MarianneRohrbach
New Contributor III

Could you explain why you would want to duplicate potentially large B features in order to join attributes of A? When joining this way you will end up with a duplicated B geometry for every A with its centroid in B, and the A geometry is lost.
What is the use case for such a requirement?
