Generate near table too slow for large dataset?

01-30-2013 01:19 AM
AndrejGrah
New Contributor
Hi,
I have noticed that "Generate Near Table" runs really (too) slowly on large datasets. I have 43 million points in a shapefile and I want to get the 10 closest neighbours for each point within a search radius (in the worst case that means 430 million records in the near table). The shapefile is located on a local HDD. The upper limits are not exceeded. In 24 h only 17% of the task was done :(

arcpy.GenerateNearTable_analysis(shapeFile, shapeFile, outputNearTable, radius, "NO_LOCATION", "NO_ANGLE", "ALL", 10)

It takes only 20s for 100k points.

Any suggestions on how to speed this up?
3 Replies
AndrewChapkowski
Esri Regular Contributor
If you have a search radius, you can try extracting your points to a temporary feature class first using the Clip tool, then running the Near tool on that.

You might see a performance gain if you move away from shapefiles and use a file geodatabase feature class. Also, try calculating a spatial index on your point feature class.
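A minimal sketch of the file geodatabase suggestion, assuming an ArcGIS install with arcpy available. The folder, geodatabase name, and feature class name are hypothetical placeholders, and `fgdb_paths` and `copy_to_fgdb` are helper names invented for this example. Whether the copy actually speeds up the near table depends on the data, so treat it as something to time on a small sample first.

```python
import os


def fgdb_paths(folder, gdb_name, fc_name):
    """Return (gdb_path, feature_class_path) for a file geodatabase in `folder`."""
    if not gdb_name.lower().endswith(".gdb"):
        gdb_name += ".gdb"
    gdb_path = os.path.join(folder, gdb_name)
    return gdb_path, os.path.join(gdb_path, fc_name)


def copy_to_fgdb(shape_file, folder, gdb_name="near_work", fc_name="points"):
    """Copy a shapefile into a file geodatabase and (re)build its spatial index.

    The default names are assumptions for illustration only.
    """
    import arcpy  # imported here so the path helper works without ArcGIS

    gdb_path, fc_path = fgdb_paths(folder, gdb_name, fc_name)
    if not arcpy.Exists(gdb_path):
        # CreateFileGDB takes the parent folder and the geodatabase name
        arcpy.CreateFileGDB_management(folder, os.path.basename(gdb_path))
    arcpy.CopyFeatures_management(shape_file, fc_path)
    # Rebuild the spatial index on the copied feature class
    arcpy.AddSpatialIndex_management(fc_path)
    return fc_path
```

The returned path can then be passed to `arcpy.GenerateNearTable_analysis` in place of `shapeFile`.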
JustinMeyers
New Contributor III
Shapefiles have a 2 GB limit (http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=Geoprocessing_considerations_for_shape...). As mentioned above, make a subset of the data that you know falls within the area. Sometimes building a spatial index takes longer than the query itself...

If I were doing this, I would select the features within the distance needed and save them as a new subset, then run the tool on that subset. If the output .dbf grows larger than 2 GB, it is junk.
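As a rough check on the 2 GB concern, the worst-case row count from the original question can be compared against the limit. This is only a back-of-the-envelope sketch: the 30 bytes per row is an assumed figure, not a measured record width, and the limit matters only if the near table is actually written as a .dbf (the thread does not say what format `outputNearTable` uses).

```python
DBF_LIMIT_BYTES = 2 * 1024 ** 3  # the 2 GB limit mentioned in the thread


def worst_case_rows(n_points, closest_count):
    """Upper bound on near-table rows: every point finds `closest_count` neighbours."""
    return n_points * closest_count


def exceeds_dbf_limit(n_rows, bytes_per_row):
    """True if a table of `n_rows` rows at `bytes_per_row` would pass 2 GB."""
    return n_rows * bytes_per_row > DBF_LIMIT_BYTES


rows = worst_case_rows(43000000, 10)  # 43 million points, 10 neighbours each
print(rows)
# bytes_per_row=30 is an assumption; measure it on a small test table instead
print(exceeds_dbf_limit(rows, bytes_per_row=30))
```

Measuring the real record width on a small run (for example the 100k-point test) and plugging it in gives a better estimate of how many input points one output table can hold.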

Good Luck!
RyanClancy
Occasional Contributor
You might try creating a subset as suggested above but instead of creating a new file, put the subset in memory. Where the tool you're using calls for an output parameter use "in_memory\dataset_name". I find that working in_memory is a lot faster, provided that you've got a decent computer of course. You can write your NearTable as a file somewhere since it's the final output.
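A sketch of the in_memory approach, assuming arcpy is available. The function names, the subset name `near_subset`, and the `where_clause` used to pick the subset are hypothetical; how the subset is chosen is up to you. Two caveats worth checking on a small sample: the IDs in the resulting table refer to the in-memory copy rather than the source shapefile, and neighbours lying outside the subset are not found.

```python
def in_memory_path(name):
    """Build a dataset path inside the in_memory workspace."""
    return "in_memory\\" + name


def near_table_in_memory(points, where_clause, out_table, radius, closest_count=10):
    """Copy a subset of `points` into memory and build its near table on disk.

    `where_clause` selects the subset; it is a placeholder for whatever
    attribute query fits the data.
    """
    import arcpy  # requires an ArcGIS install

    subset = in_memory_path("near_subset")
    if arcpy.Exists(subset):
        arcpy.Delete_management(subset)
    # Write the selected points to the in_memory workspace
    arcpy.Select_analysis(points, subset, where_clause)
    try:
        # Same options as the original call; only the final table goes to disk
        arcpy.GenerateNearTable_analysis(subset, subset, out_table, radius,
                                         "NO_LOCATION", "NO_ANGLE", "ALL",
                                         closest_count)
    finally:
        # Free the memory held by the temporary subset
        arcpy.Delete_management(subset)
    return out_table
```

With 43 million points the whole dataset may not fit in memory, which is one reason to work on subsets rather than loading everything at once.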