I am working on a custom, Python/arcpy-based QAQC process for some of our corporate data. I am using ArcGIS Desktop 10.3 accessed via Citrix. The feature classes being tested are in a file geodatabase, with intermediate data being stored in a scratch geodatabase.
Among the checks the tool will perform is a check for overlapping features within an input feature class. My current workflow for this check is:
- Run the Intersect (Analysis) tool on the input feature class, with the "ONLY_FID" option for output fields
- Join the intersect output back to the input feature class
- Use Calculate Field to flag the records which have overlapping features
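For reference, the workflow above looks roughly like this as an arcpy sketch. The paths, dataset names, and the `FID_Contours`/`OVERLAP_FLAG` field names are placeholders, not my actual data:

```python
import arcpy

arcpy.env.workspace = r"C:\data\scratch.gdb"   # hypothetical scratch GDB
fc = r"C:\data\corporate.gdb\Contours"         # hypothetical input

# 1. Self-intersect: features that overlap intersect each other,
#    so the output records point back at the offending input FIDs
arcpy.Intersect_analysis([fc], "overlaps", join_attributes="ONLY_FID")

# 2. Join the intersect output back to the input on the FID_<name> field
lyr = arcpy.MakeFeatureLayer_management(fc, "fc_lyr")
arcpy.AddJoin_management(lyr, "OBJECTID", "overlaps", "FID_Contours",
                         "KEEP_COMMON")

# 3. Flag the joined (i.e. overlapping) records
arcpy.CalculateField_management(lyr, "Contours.OVERLAP_FLAG", "1", "PYTHON")
```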
On small and medium-sized feature classes this works fine. However, I am now testing it on our largest dataset: a polyline (contour) feature class with over 3 million records and over a billion vertices in total. The Intersect tool uses tiling when run on this dataset, but it is extremely slow: after four hours it was only 20% complete. I would love to get the processing time down to a minimum. I've read this help doc about geoprocessing with large datasets, and I've tried using the Dice tool to split the lines up into features with fewer vertices, but the process is still extremely slow. Our Citrix servers are temporarily unavailable overnight for maintenance, and although I tried using a "batch" version of the software that supposedly allows processes to run overnight, when I arrived this morning the application had closed without finishing successfully.
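In case it helps clarify what I'm after with Dice: the idea, stripped of arcpy, is just to cap the number of vertices per feature while repeating the shared endpoint so the geometry is unchanged. A minimal stand-alone sketch (the `dice` function and the vertex-count choice are my own illustration, not the actual tool):

```python
def dice(vertices, max_verts=1000):
    """Split a polyline's vertex list into runs of at most max_verts
    vertices, repeating the shared endpoint between consecutive parts
    so the joined parts trace the same line."""
    parts = []
    i = 0
    while i < len(vertices) - 1:
        parts.append(vertices[i:i + max_verts])
        i += max_verts - 1  # last vertex of this part starts the next

    return parts

line = list(range(10))          # stand-in for a 10-vertex polyline
print(dice(line, max_verts=4))  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```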
Any ideas? Will topology work faster? Should I go for my own custom tiling scheme? Thanks for any help you can provide.
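If a custom tiling scheme is the way to go, I was imagining something like the following: compute a grid of tile extents over the data's bounding box, then select and intersect per tile (tile counts and the extent values here are made up for illustration):

```python
def make_tiles(xmin, ymin, xmax, ymax, nx, ny):
    """Divide an extent into an nx-by-ny grid of tile extents,
    each returned as (xmin, ymin, xmax, ymax)."""
    dx = (xmax - xmin) / nx
    dy = (ymax - ymin) / ny
    return [(xmin + i * dx, ymin + j * dy,
             xmin + (i + 1) * dx, ymin + (j + 1) * dy)
            for j in range(ny) for i in range(nx)]

# Example: a 2x2 grid over a 100x100 extent
for tile in make_tiles(0, 0, 100, 100, 2, 2):
    print(tile)
```

Each tile extent would then drive a selection (e.g. Select Layer By Location against a tile polygon) followed by Intersect on just that subset, with results appended to the scratch geodatabase. Whether that beats the tool's built-in tiling is exactly what I'm unsure about.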