Resources to process 9 million records in od-matrix

03-29-2023 04:21 PM
peterweir
New Contributor II

I am currently importing 6.6 million records into an OD cost matrix to calculate the distance between customer locations and branches. The import is processing about 900 records per minute. At that rate, it will take ~4 days to complete.

I am assuming that, if the import finishes without failing, running the OD matrix solve will take an additional 4-5 days.
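As a rough check on the import estimate, the arithmetic can be sketched as below. It assumes the rate stays constant at 900 records per minute over all 6.6 million records, which may not hold in practice; the function name is only for illustration.

```python
def days_to_process(record_count, records_per_minute):
    """Return the estimated processing time in days at a constant rate."""
    minutes = record_count / records_per_minute
    return minutes / (60 * 24)


# Figures from the post: 6.6 million records at about 900 records per minute.
estimate = days_to_process(6_600_000, 900)
print(f"Estimated import time: {estimate:.1f} days")
```

At a constant rate this works out to roughly five days, so the import alone is a multi-day job.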

This is on a standard desktop ArcGIS installation with 32 GB RAM and 12 cores.

Can anyone advise whether there is an alternative to this 8-10 day process?

Is it simply a matter of increasing CPU/RAM?

Or is there an alternative computer configuration or set of spatial functions (distance/proximity?) that I should use instead?

3 Replies
MelindaMorang
Esri Regular Contributor

Happy to help with that.

I can't tell from your post whether you're using an OD Cost Matrix layer in Pro or using the arcpy.nax solver objects. The arcpy.nax solver objects have some performance and memory management improvements that are not available with layers, so they are generally a better way to solve very large problems.
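For reference, a minimal sketch of an OD cost matrix solve with the arcpy.nax solver object might look like the following. It requires ArcGIS Pro with the Network Analyst extension, and the function name, all paths, the travel mode name, and the choice of kilometers are assumptions to be replaced with your own values.

```python
def solve_od_cost_matrix(network, origins, destinations, out_lines,
                         travel_mode_name="Driving Distance"):
    """Solve an OD cost matrix with arcpy.nax and export the Lines output.

    All arguments are placeholders: network is a network dataset path or
    layer, the others are feature class paths. The travel mode name is an
    assumption and depends on the network dataset.
    """
    # Imported here so the function can be defined without ArcGIS Pro present.
    import arcpy

    odcm = arcpy.nax.OriginDestinationCostMatrix(network)
    # Look up the travel mode by name on the network dataset.
    odcm.travelMode = arcpy.nax.GetTravelModes(network)[travel_mode_name]
    odcm.distanceUnits = arcpy.nax.DistanceUnits.Kilometers
    # No line geometry; only the table of origin-destination costs is needed.
    odcm.lineShapeType = arcpy.nax.LineShapeType.NoLine

    input_type = arcpy.nax.OriginDestinationCostMatrixInputDataType
    odcm.load(input_type.Origins, origins)
    odcm.load(input_type.Destinations, destinations)

    result = odcm.solve()
    if not result.solveSucceeded:
        messages = result.solverMessages(arcpy.nax.MessageSeverity.All)
        raise RuntimeError(f"OD cost matrix solve failed: {messages}")

    result.export(
        arcpy.nax.OriginDestinationCostMatrixOutputDataType.Lines, out_lines
    )
    return out_lines
```

This is a sketch only; for millions of records a single solve like this will probably still need to be broken into smaller pieces.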

You can also get some speed improvements by using a network dataset in a mobile geodatabase instead of a file geodatabase. You can convert an unlicensed file geodatabase network dataset to a mobile geodatabase by running the Create Mobile Map Package tool and extracting the resulting package.

The easiest way to solve your very large problem might be to download our parallel OD cost matrix tool (https://github.com/Esri/large-network-analysis-tools). The repository also includes resources from past DevSummit presentations that show techniques for reducing the problem size.
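One general way to make a very large OD problem tractable is to split the origins into chunks that can be solved independently, for example in parallel. The sketch below illustrates only that idea in plain Python; it is not the tool's actual code, and the function name and chunk size are hypothetical.

```python
def chunk_ranges(total_count, chunk_size):
    """Yield (start, end) ID ranges, 1-based and inclusive, covering total_count rows."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(1, total_count + 1, chunk_size):
        yield start, min(start + chunk_size - 1, total_count)


# Example: 6.6 million origins in chunks of 1,000 (an arbitrary size).
ranges = list(chunk_ranges(6_600_000, 1000))
print(len(ranges), ranges[0], ranges[-1])
```

Each range could then be used to select a subset of origins and solve it as its own, much smaller OD cost matrix.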

peterweir
New Contributor II

Hi Melinda - many thanks for replying with what looks to be a good solution to my OD Matrix problem.

But I am having a small problem running it on my data.

As per the attached image, I am using a sample of ~28k location records (people) to be allocated to 2 destination locations.

While the process seems to have worked according to the 'View Details' messages (see attached), the results have two problems:

1/ OutputODlinesFeature_v243 does not actually contain any line features

2/ Also, when I joined OutputOrigins_v1 with OutputODlinesFeature_v243, the distances were incorrect

Just wondering if it's related to my southern hemisphere Lambert conformal projection?

Any help will be greatly appreciated

Regards Peter

MelindaMorang
Esri Regular Contributor

Hi Peter.

Regarding question 1: Correct, by default, the tool does not generate line geometry in the outputs, as my research indicated that most users didn't need the lines and are mostly interested in the table.  But you can override this default behavior.  In the location where you saved the toolbox files, find the file called od_config.py and change line 35 to 

"lineShapeType": arcpy.nax.LineShapeType.StraightLine,
Note that with OD Cost Matrix, your only options are straight lines or no lines. The driving time and distance are calculated along the network, but the output geometry can only be either straight lines representing the connections or no lines (this is the same as with layers).
 
I'm not sure about question 2.  How do you know the distances are incorrect, or in what way are they incorrect?  It won't be related to your projection or coordinate system.  That should just work.  More likely would be some incorrect mapping between the Origin and Destination IDs and the values in the Lines table, either on your end or in the tool's code.  The tool's code has been tested and should be okay, although I do see that you're using Pro 2.9.  We made some fixes to the arcpy.nax solver objects in 3.0 (I think...hard to remember) to resolve some issues with relating IDs in the outputs.  It's conceivable that you're running into an older Pro bug that is now resolved.  But that's just a guess, so let's rule out anything on your end before we go down that rabbit hole.  But if you have the ability to update your software, that will never hurt!
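One way to rule out an ID mapping problem on your end is to check that every origin ID referenced in the Lines table exists in the origins, and that each origin has the expected number of rows. The sketch below does this on plain Python data; the field names OriginOID, DestinationOID, and Total_Distance and the sample rows are assumptions for illustration and should be replaced with the fields in your actual outputs.

```python
from collections import Counter


def check_od_ids(origin_ids, line_rows, expected_per_origin):
    """Return (unknown_ids, wrong_count_ids) for a list of OD line rows.

    line_rows: dicts with an "OriginOID" key (assumed field name).
    unknown_ids: origin IDs in the lines that are not in origin_ids.
    wrong_count_ids: known origins whose row count differs from expected.
    """
    known = set(origin_ids)
    counts = Counter(row["OriginOID"] for row in line_rows)
    unknown = sorted(oid for oid in counts if oid not in known)
    wrong = sorted(
        oid for oid in known if counts.get(oid, 0) != expected_per_origin
    )
    return unknown, wrong


# Hypothetical sample: 3 origins, each expected to reach 2 destinations.
origins = [1, 2, 3]
lines = [
    {"OriginOID": 1, "DestinationOID": 1, "Total_Distance": 4.2},
    {"OriginOID": 1, "DestinationOID": 2, "Total_Distance": 7.9},
    {"OriginOID": 2, "DestinationOID": 1, "Total_Distance": 3.1},
    {"OriginOID": 7, "DestinationOID": 2, "Total_Distance": 5.0},
]
unknown_ids, wrong_count_ids = check_od_ids(origins, lines, expected_per_origin=2)
print("Origin IDs in lines but not in origins:", unknown_ids)
print("Origins with an unexpected number of lines:", wrong_count_ids)
```

If both lists come back empty on your real data, the join keys are probably consistent and the problem is more likely elsewhere.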