I'm experiencing terribly slow performance when eliminating small polygons within a file geodatabase feature class using the Eliminate tool, and it's particularly noticeable on larger datasets. For example, in one run I'm trying to eliminate 4,784 of 2,824,226 polygon features, and the tool is still running after 8 days on a newer Razer gaming laptop running Pro 2.6.3 (image provided). The tool message remains at "Processing Tiles"; that progress bar has repeatedly hit 100% and then jumped back down again. In a second Eliminate run on a high-end HP ZBook G5 with 128 GB RAM running Pro 2.4.2, an attempt to eliminate 90,664 of 764,904 polygons is still ongoing after about 20 hours. In a third example, also on 2.4.2, I eventually killed the process via Task Manager after about 5 days and tried an FME Desktop workflow instead (with mixed results using the AreaGapAndOverlapCleaner transformer).
I've also noticed that when Eliminate does complete, it doesn't actually eliminate all the selected features. I've had to run Eliminate multiple times to fully remove everything I'm interested in eliminating (e.g. all polygons < 5,000 m²).
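Until the multiple-pass behaviour is fixed, one workaround is to script the repetition. Below is a minimal arcpy sketch of that idea; the paths, the 5,000 m² threshold, and the pass limit are placeholders, and it assumes a licensed ArcGIS Pro Python environment:

```python
# Sketch: re-run Eliminate until no features under the area threshold remain.
# Paths and the threshold are placeholders -- adjust for your data.
import arcpy

arcpy.env.overwriteOutput = True

fc = r"C:\data\work.gdb\forest_inventory"   # hypothetical input feature class
threshold_sql = "Shape_Area < 5000"         # < 5,000 m2 (units of the coordinate system)
max_passes = 5

current = fc
for i in range(1, max_passes + 1):
    # Eliminate operates on the selected features of a layer.
    layer = arcpy.management.MakeFeatureLayer(current, f"lyr_{i}")
    arcpy.management.SelectLayerByAttribute(layer, "NEW_SELECTION", threshold_sql)
    count = int(arcpy.management.GetCount(layer)[0])
    if count == 0:
        print(f"Done after {i - 1} pass(es); nothing left under the threshold.")
        break
    print(f"Pass {i}: {count} slivers selected.")
    out_fc = f"{fc}_elim{i}"
    # Merge each selected sliver into the neighbour sharing the longest border.
    arcpy.management.Eliminate(layer, out_fc, "LENGTH")
    current = out_fc
```

Each pass writes a new feature class, so you may want to clean up the intermediate outputs once the loop converges.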
Eliminate is one of the most computationally intensive processes you can undertake.
Its raster equivalent is much faster. Since you suggest you are doing this regularly, how is the data getting into this state in the first place? Are you using topology at all?
Hello... there are multiple reasons why some of our datasets have small polygons. For example, updating provincial-scale forest inventories of various temporal vintages with historic harvest/natural disturbances, admin boundaries, etc., or creating polygons from multiple rasters after they've been combined. I do use topology; however, I don't believe setting tolerances there would work when trying to eliminate areas under 0.5 ha. I have used Lookup as the raster equivalent of Dissolve, but I'm not familiar with the raster equivalent of Eliminate. What would that be?
As for performance, I can appreciate the computational intensity, but I think waiting days (8 days and counting) for a process to complete on high-end hardware is unreasonable. It would be nice if Eliminate could use cartographic partitions or something similar to improve performance.
I'm reluctant to start clipping datasets and running Eliminate on the smaller subsets; however, I may end up doing so. I'm also going to explore the equivalent command in QGIS.
It may not be appropriate if you have to convert back to vector, but for integer raster data the workflow is: run Region Group to create individual zones from the input, query for all the regions smaller than your threshold size, use Set Null to turn those regions into NoData, then run Nibble to fill the NoData areas with the values that bound them (essentially filling in the gaps).
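The steps above can be strung together with the Spatial Analyst tools in arcpy. This is a sketch under stated assumptions: the paths are placeholders, the 0.5 ha (5,000 m²) threshold comes from the earlier post, the raster is assumed to be in a projected coordinate system with metre units, and a Spatial Analyst license is required:

```python
# Sketch of the raster-based "eliminate": Region Group -> Set Null -> Nibble.
import arcpy
from arcpy.sa import Nibble, Raster, RegionGroup, SetNull

arcpy.CheckOutExtension("Spatial")

in_ras = Raster(r"C:\data\work.gdb\cover_class")   # hypothetical integer raster
cell_area = in_ras.meanCellWidth * in_ras.meanCellHeight
min_cells = int(5000 / cell_area)                  # 0.5 ha threshold, in cells

# 1) Group connected cells of the same value into regions; each region's
#    cell count is stored in the COUNT field of the output's attribute table.
regions = RegionGroup(in_ras, "FOUR", "WITHIN", "NO_LINK")

# 2) Regions below the threshold become NoData; all other cells keep
#    their original value from the input raster.
mask = SetNull(regions, in_ras, f"COUNT < {min_cells}")

# 3) Nibble replaces the cells that are NoData in the mask with the
#    values of the nearest neighbouring data cells.
result = Nibble(in_ras, mask, "DATA_ONLY")
result.save(r"C:\data\work.gdb\cover_class_elim")
```

Note that "FOUR" versus "EIGHT" connectivity in Region Group changes which diagonal slivers get grouped together, so it's worth checking which matches how your vector slivers were defined.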
Following up on this posting, I'm grateful that Esri Support Services reached out to me regarding the challenges I was having. As a result of that process, the following enhancements and bug were logged:
ENH-000137016 - Create a Pairwise Eliminate Equivalent Tool, or Expose Eliminate to the Parallel Processing Factor
ENH-000137019 - Create a Pairwise Symmetrical Difference Tool
BUG-000137018 - [Data-Specific] The Eliminate Geoprocessing Tool Doesn't Eliminate All Qualifying Features on an Initial Pass; Requires Multiple Passes to Fully Eliminate All Features
Shout-out to Bradley at Esri for reaching out to me about this and following through on this with the Product Team.
So, is the Eliminate tool basically useless on large datasets? I'm trying to eliminate 13,000 polygons of less than half an acre out of a 334,000-polygon dataset. The tool ran for hours with the same behaviour noted above, continually cycling to 100% and then starting over. I let it run over the weekend, only to find that Pro had crashed at some point.
Nice to know Esri has documented bugs, but why is this tool available if it seems not to work?