Problem while geoprocessing (intersecting) a large amount of data in ArcGIS Pro and ArcMap

01-26-2018 06:34 AM
chiranjeevijanjanam
New Contributor II

Hi,

I am doing my Master's thesis with GIS data and have to process a large amount of it. I have more than 2 million rows, which I must intersect with another map to get the common attribute table that I then use in other software to get the required results. The processing itself is taking hours. Yesterday I started it in ArcMap and it ran for 18 hours, but it only reported "drawing features 36,000". At that rate I thought it would take months to complete the operation, so I installed Pro after seeing this thread: Dissolving Large Data

But the problem persists. I waited for 3 hours, at which point Pro returned an error and started processing again. Can you tell me how to get through an intersect on data this large?

21 Replies
chiranjeevijanjanam
New Contributor II

Thank you, Ken.

.....data removed....

DanPatterson_Retired
MVP Emeritus

Some suggestions.

Creating a feature class from the shapefile will improve things (I assume it should be in the same geodatabase).

Prior to doing any intersections, make sure that you have made the feature class with the join permanent (i.e., saved it as a new feature class).

Limit your area of interest prior to doing any intersection. In other words, are there areas in both files that you know won't intersect? You want to avoid checking geometries against geometries that won't participate in an intersection at all. This could be simplified by doing a Select By Location to limit the inputs to only the overlapping areas.
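
The steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: it needs an ArcGIS Pro Python environment, the function name `intersect_overlap_only` and the dataset names (`input_native`, `input_subset`, `intersect_result`, `input_lyr`) are hypothetical placeholders, and it assumes the outputs do not already exist in the geodatabase.

```python
import os


def intersect_overlap_only(shapefile, other_fc, gdb):
    """Copy a shapefile into a geodatabase, pre-select, then intersect.

    All dataset names used here are hypothetical placeholders.
    """
    # Imported here so the sketch can be read without ArcGIS installed;
    # actually running it requires an ArcGIS Pro Python environment.
    import arcpy

    native_fc = os.path.join(gdb, "input_native")
    subset_fc = os.path.join(gdb, "input_subset")
    result_fc = os.path.join(gdb, "intersect_result")

    # 1. Store the shapefile as a native geodatabase feature class.
    #    If the input is a layer with a table join, copying it should also
    #    write the joined fields, which makes the join permanent.
    arcpy.management.CopyFeatures(shapefile, native_fc)

    # 2. Select only the features that intersect the other dataset.
    arcpy.management.MakeFeatureLayer(native_fc, "input_lyr")
    arcpy.management.SelectLayerByLocation("input_lyr", "INTERSECT", other_fc)

    # 3. Copy the layer while the selection is active to save the subset.
    arcpy.management.CopyFeatures("input_lyr", subset_fc)

    # 4. Run the overlay on the reduced input.
    arcpy.analysis.PairwiseIntersect([subset_fc, other_fc], result_fc)
    return result_fc
```

A call might look like `intersect_overlap_only(r"C:\data\parcels.shp", r"C:\data\work.gdb\zones", r"C:\data\work.gdb")`, where all three paths are made up for illustration.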

chiranjeevijanjanam
New Contributor II

Thank you, it works for me now. It worked in ArcGIS Pro: Pairwise Intersect took just under 2 minutes, and the normal Intersect took around 15 minutes. I checked all your suggestions 3-4 times and followed them carefully, and it worked pretty well.

JoshuaBixby
MVP Esteemed Contributor

I agree with Dan: work in the native file formats for the best performance. Working with shapefiles, CSV files, and Excel files, and doing joins and geoprocessing on them, is a recipe for disaster. We have come across several geoprocessing operations in Pro that take 10x longer with shapefiles than with native file formats.

chiranjeevijanjanam
New Contributor II

Your answer is also right, same as above. It helped me a lot to work efficiently.

VinceAngelo
Esri Esteemed Contributor

Two million rows isn't all that much with modern computers. I regularly process 20m x 60k Intersect operations, and they rarely run more than an hour. 8 GB of RAM isn't all that much; 16 GB is a modern low-end RAM allocation.

But more important than RAM is the type and speed of your disk. A laptop with a clunky old HDD with 40 ms seek times would take a compute-year or more to do what a hot new SSD with <1 ms seek times can do in minutes.

- V

ColeAndrews
Occasional Contributor III

Vince Angelo, you're accomplishing that in what? Surely not in Desktop...

VinceAngelo
Esri Esteemed Contributor

Yes, ArcMap 10.4.1 (32-bit ArcPy), in fact (and don't call me Shirley)

- V

ColeAndrews
Occasional Contributor III

That's solid... I want the insider's secret.

I'm running intersects in Pro using an NVMe SSD, the best there is, and Pro would 'Shirley' crash before it ever completed 20m points to 60k polygon intersects.

Alteryx is the fastest analytics software with a GUI I've worked with, and it took 2 days for us to intersect 30 million points with 1k drive time polygons.

KenHartling
Esri Contributor

Please help us by sending us the case where you see Pro 'Shirley' crashing so we can take a look, either via your support contact or we can arrange for you to send it to me directly. It would also be good to know your machine specs.

I'd love to see your '2 days for us to intersect 30 million points with 1k drive time polygons' case as well.  2 days seems a little too long to me unless the machine it is being run on is pretty slow or doesn't have adequate resources.

Pro has had the ability to run most of the overlay tools in parallel for some time (set arcpy.env.parallelProcessingFactor = 100 before running the tool). If you have a machine that can handle it, you may be able to get more performance out of many overlay operations.
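
In a script, that setting might be applied as in the sketch below. It assumes an ArcGIS Pro Python environment; the function name `intersect_in_parallel` and its arguments are hypothetical, and the value 100 is simply the one suggested above. Whether it speeds anything up depends on the tool and the machine.

```python
def intersect_in_parallel(in_features, out_fc, factor=100):
    """Run Intersect with the parallel processing environment set.

    Hypothetical helper; `factor` defaults to the value from the post.
    """
    # Imported here so the sketch can be read without ArcGIS installed;
    # actually running it requires an ArcGIS Pro Python environment.
    import arcpy

    # Environment settings apply to the tools run after they are set.
    arcpy.env.parallelProcessingFactor = factor

    # in_features is a list of the datasets to overlay.
    arcpy.analysis.Intersect(in_features, out_fc)
    return out_fc
```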

Thanks.
