Generate near table too slow for large dataset?

AndrejGrah · 01-30-2013

Hi,
I have noticed that "Generate Near Table" is really (too) slow for large datasets. I have 43 million points in a shapefile and I want to get the 10 closest neighbours for each point within a radius (in the worst case that means 430 million records in the near table). The shapefile is on a local HDD. No upper size limits are exceeded. After 24 h only 17% of the task was done :(

arcpy.GenerateNearTable_analysis(shapeFile, shapeFile, outputNearTable, radius, "NO_LOCATION", "NO_ANGLE", "ALL", 10)

It takes only 20s for 100k points.
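For scale, the figures in this post can be turned into a rough projection. This is only back-of-the-envelope arithmetic: it assumes the 17% progress figure is linear in time, which the tool does not guarantee.

```python
# Rough projection from the numbers quoted in the post.
# Assumption: reported progress (17% after 24 h) is linear in time.
total_points = 43 * 10**6
neighbours_per_point = 10
max_rows = total_points * neighbours_per_point      # 430,000,000 rows in the worst case

hours_elapsed = 24.0
fraction_done = 0.17
projected_hours = hours_elapsed / fraction_done     # about 141 h, i.e. almost 6 days

# What the 100k-point test run would predict if run time scaled linearly.
sample_points = 100 * 10**3
sample_seconds = 20.0
linear_hours = sample_seconds * total_points / sample_points / 3600.0   # about 2.4 h

slowdown = projected_hours / linear_hours           # about 59x slower than linear scaling

print("worst-case rows: %d" % max_rows)
print("projected run time: %.1f h" % projected_hours)
print("linear extrapolation: %.1f h" % linear_hours)
print("slowdown vs. linear: %.0fx" % slowdown)
```

If the assumption holds, the full run is roughly 59 times slower than a linear extrapolation from the small test, which suggests the cost per point grows with the size of the dataset.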

Any suggestions on how to speed this up?

AndrewChapkowski · 01-30-2013

If you have a search radius, you can try extracting your points to a temporary feature class first with the Clip tool, then running the near tool on that.

You might see a performance gain if you move away from shapefiles and use a file geodatabase feature class. Also, try adding a spatial index to your point feature class.
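A minimal sketch of that suggestion, assuming ArcGIS 10.x with arcpy available. The function name, the "points" and "near_table" dataset names, and the folder and geodatabase arguments are hypothetical; whether this is actually faster for 43 million points would have to be measured.

```python
import os


def near_table_from_gdb(shapefile, gdb_folder, gdb_name, radius, k=10):
    """Copy the points into a file geodatabase, index them, and build the near table there.

    gdb_name is expected to include the ".gdb" extension.
    """
    import arcpy  # imported here so the module loads without an ArcGIS installation

    gdb = os.path.join(gdb_folder, gdb_name)
    points = os.path.join(gdb, "points")          # hypothetical feature class name
    near_table = os.path.join(gdb, "near_table")  # hypothetical table name

    arcpy.CreateFileGDB_management(gdb_folder, gdb_name)
    arcpy.CopyFeatures_management(shapefile, points)
    arcpy.AddSpatialIndex_management(points)      # (re)builds the spatial index
    # Same call as in the question, but reading from and writing to the geodatabase.
    arcpy.GenerateNearTable_analysis(points, points, near_table, radius,
                                     "NO_LOCATION", "NO_ANGLE", "ALL", k)
    return near_table
```

Writing the near table into the geodatabase as well keeps the output out of the .dbf format.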

JustinMeyers · 02-04-2013

Shapefiles have a 2 GB limit (http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=Geoprocessing_considerations_for_shape...). As mentioned above, make a subset of the data that you know falls within the area. Sometimes building a spatial index takes longer than the query itself...

If I were doing this, I would select the features within the distance needed and create a new subset, then run the tool on that subset. If the .dbf becomes larger than 2 GB, it is junk.
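To see how the 2 GB limit relates to the worst case in the question, here is a quick check, taking 2 GB as 2 × 1024³ bytes and ignoring the .dbf header. It only matters if the near table is written as a .dbf.

```python
# How many bytes per row would a 2 GB .dbf leave for the worst-case near table?
DBF_LIMIT_BYTES = 2 * 1024**3            # 2 GB, header size ignored
max_rows = 43 * 10**6 * 10               # 43 million points x 10 neighbours each

bytes_per_row_budget = DBF_LIMIT_BYTES / float(max_rows)   # just under 5 bytes

print("budget per row: %.2f bytes" % bytes_per_row_budget)
```

Each near-table row holds at least IN_FID, NEAR_FID and NEAR_DIST, and a .dbf stores every field as fixed-width text, so a row cannot fit in under 5 bytes. The full worst-case result would not fit in a single .dbf.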

Good Luck!

RyanClancy · 02-04-2013

You might try creating a subset as suggested above, but instead of creating a new file, put the subset in memory. Wherever the tool you're using calls for an output parameter, use "in_memory\dataset_name". I find that working in_memory is a lot faster, provided that you've got a decent computer, of course. You can still write your near table to a file somewhere, since it's the final output.
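A minimal sketch of the in_memory approach, assuming ArcGIS 10.x with arcpy available. The function name, the clip_area argument and the "points_subset" dataset name are hypothetical. The in_memory workspace lives in RAM, so all 43 million points may not fit at once and the work would likely have to be done one area at a time.

```python
def near_table_in_memory(points, clip_area, out_table, radius, k=10):
    """Clip the points to one area, hold the subset in memory, write the near table to disk."""
    import arcpy  # imported here so the module loads without an ArcGIS installation

    subset = r"in_memory\points_subset"   # hypothetical dataset name
    arcpy.Clip_analysis(points, clip_area, subset)
    try:
        # Same parameters as in the question; only the inputs come from memory.
        arcpy.GenerateNearTable_analysis(subset, subset, out_table, radius,
                                         "NO_LOCATION", "NO_ANGLE", "ALL", k)
    finally:
        arcpy.Delete_management(subset)   # free the memory even if the tool fails
    return out_table
```

One caveat when splitting the data into areas: points near the edge of an area lose neighbours that lie just outside it, so the areas would need to overlap by at least the search radius, and the duplicate rows from the overlaps would need to be dropped afterwards.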