Dear ArcGIS Community,
As part of my research, I am creating travel cost matrices for ~30k origin locations and ~3k destination locations. Calculating travel cost matrices at this scale is quite time-consuming and currently takes several days. My current workflow is as follows:
- Start with two ".csv" files, containing the latitude and longitude coordinates for origins and destinations.
- Using Python, I create ".lyr" files that contain a street network, the origins, the destinations, and the relevant solver options (a simplified sketch of this step is included after this list).
- Using a ".bat" script, I call the executable file "GenerateCSVMatrixFromOD.exe" and apply it to each ".lyr" file created in the preceding step.
Because there are so many origins and destinations, I have to split my data into chunks before processing. At the moment I split the data into 6 chunks, each containing all ~30k origin locations and roughly 500 destination locations. Running the process above on each of the 6 ".lyr" files takes about 3 days of computing time in total. In the near future I plan another experiment that calculates, for the same ~30k origin locations, the travel cost to nearly 40k unique destinations; at the current rate, that would take weeks.
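The chunking itself is simple; I do something like the following (filenames and chunk count are just what I use now):

```python
import numpy as np
import pandas as pd

# Split the destination list into equal-sized chunks, one CSV per chunk.
destinations = pd.read_csv("destinations.csv")  # placeholder filename
n_chunks = 6                                    # currently ~500 destinations per chunk

for i, chunk in enumerate(np.array_split(destinations, n_chunks)):
    chunk.to_csv(f"destinations_chunk_{i:02d}.csv", index=False)
```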
My main question is this: is it possible to speed up this process? I could envision opening multiple ".bat" scripts to process the chunks simultaneously, but I'm not sure how ArcGIS would handle several instances running at once, whether there would be issues with sharing the same geodatabase, or with contention for the same physical CPU resources. I could also envision partitioning my data in a different, possibly more efficient, way.
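To make the parallel idea concrete, the kind of driver script I have in mind is below. The command-line arguments for GenerateCSVMatrixFromOD.exe are a guess on my part, and the worker count would obviously need tuning to the machine:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

EXE = r"C:\path\to\GenerateCSVMatrixFromOD.exe"  # placeholder path
LYR_DIR = Path(r"C:\path\to\lyr_chunks")         # placeholder path

def solve_chunk(lyr_path: Path) -> int:
    # Each call runs one independent instance of the solver executable.
    # The arguments here are an assumption -- adjust to whatever the
    # executable actually expects.
    result = subprocess.run([EXE, str(lyr_path)], capture_output=True, text=True)
    return result.returncode

if __name__ == "__main__":
    lyr_files = sorted(LYR_DIR.glob("*.lyr"))
    # Limit concurrency (e.g. 3 workers) so the machine is not oversubscribed;
    # tune this to the number of physical cores and available RAM.
    with ThreadPoolExecutor(max_workers=3) as pool:
        for lyr, code in zip(lyr_files, pool.map(solve_chunk, lyr_files)):
            print(f"{lyr.name}: exit code {code}")
```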
What's the best way to speed up this process?
I am happy to provide any additional details that would help. Thank you for your time.