
Select by Location glacially slow?

7 hours ago
IanESRIHogan
New Contributor

I have a file geodatabase that contains a single layer with roughly 1.1 million point features. I also have a shapefile from GADM.org containing a single polygon feature, the US country boundary. Both datasets are stored on an SSD.

I'm attempting to use Select by Location to figure out all the points that do not intersect the US. It appears that this will take hours. Is this normal? ArcGIS Pro is only using 8% of CPU (machine has 8 cores), and is only using about 2GB of RAM (machine has 32 GB).

I created a spatial index on the point feature layer and when zooming / panning, everything is drawn quickly.

Am I going about this wrong?

 

1 Solution

Accepted Solutions
MobiusSnake
MVP Regular Contributor

The single-feature GADM.org boundary for the US is an extremely complex polygon with over two million vertices.  When polygons get that complex it becomes difficult to perform spatial operations against them.

When I downloaded the boundary from GADM.org it included three shapefiles: one suffixed with a "0" (the entire country), one suffixed with a "1" (state boundaries), and one suffixed with a "2" (county boundaries, I think).

Although it's counterintuitive, if you use the county boundaries containing over three thousand features, you'll get better performance than you will with the single-feature boundary, because those three thousand features are much simpler shapes.

I tested this against a 12M+ point feature class I have (points around the globe). With the "0" shapefile the tool was stuck on 1%; after I switched to the "2" shapefile it completed in about two minutes.
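
For anyone curious why more features can mean less work, here is a toy sketch in plain Python (not arcpy, and not a claim about how ArcGIS Pro implements Select by Location). It assumes a simple model: a bounding-box check first, then a ray-casting point-in-polygon test whose cost grows with the polygon's vertex count. The shapes, vertex counts, and function names (`rectangle`, `select_outside`, and so on) are all made up for illustration.

```python
import random


def rectangle(x0, y0, x1, y1, per_side):
    """Closed rectangle outline with `per_side` vertices along each edge."""
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    pts = []
    for (ax, ay), (bx, by) in zip(corners, corners[1:] + corners[:1]):
        for k in range(per_side):
            t = k / per_side
            pts.append((ax + t * (bx - ax), ay + t * (by - ay)))
    return pts


def bbox(poly):
    """Axis-aligned bounding box as (xmin, ymin, xmax, ymax)."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(pt, poly):
    """Ray-casting test; every edge of the polygon is visited once."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_outside(points, polygons):
    """Return (indices of points outside every polygon, edges examined)."""
    boxes = [bbox(p) for p in polygons]
    outside, edges = [], 0
    for idx, (x, y) in enumerate(points):
        hit = False
        for (xmin, ymin, xmax, ymax), poly in zip(boxes, polygons):
            # Cheap rejection: only run the full test inside the envelope.
            if xmin <= x <= xmax and ymin <= y <= ymax:
                edges += len(poly)
                if point_in_polygon((x, y), poly):
                    hit = True
                    break
        if not hit:
            outside.append(idx)
    return outside, edges


rng = random.Random(42)
points = [(rng.uniform(-5, 15), rng.uniform(-5, 15)) for _ in range(500)]

# Hypothetical "country": one 10 x 10 square with 4,000 vertices.
country = [rectangle(0, 0, 10, 10, 1000)]
# Hypothetical "counties": the same area as 100 cells of 80 vertices each.
counties = [rectangle(i, j, i + 1, j + 1, 20)
            for i in range(10) for j in range(10)]

outside_country, edges_country = select_outside(points, country)
outside_counties, edges_counties = select_outside(points, counties)

print("same selection:", outside_country == outside_counties)
print("edges examined, single polygon:", edges_country)
print("edges examined, many polygons: ", edges_counties)
```

In this toy setup the counties hold more vertices in total than the country outline, yet each point is only tested against the small polygon whose envelope contains it, so far fewer edges are examined. The real GADM shapes are much messier, but the same trade-off plausibly applies.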


2 Replies
IanESRIHogan
New Contributor

Thank you so much, that was exactly the problem. I switched over to the "2" shapefile and the whole thing completed in 30 seconds. Really appreciate it.
