We use the StreetMap Premium geocoder and just installed the 2014R3 version. However, compared to the old one we had (2012R3, SDC based), the new geocoder is about 3 times slower. For example, the geocoding speed was about 130,000 records/hour with the old one (using ArcMap 10.0), and is about 45,000 records/hour with the new one (using ArcMap 10.3).
I checked computer performance while the geocoder was running. Neither CPU nor memory appeared to reach capacity, so geocoder performance should not be limited by my hardware or by resource sharing. I am wondering whether there was a major change in the geocoder itself that could lead to such a dramatic difference. I saw posts mentioning GDB vs. SDC. Does anybody have the same problem?
I came across a number while googling:
https://www.esri.com/library/whitepapers/pdfs/arcgis93-geocoding-technology.pdf
Page 6 mentions a desktop speed of 2,250,000/hour for batch geocoding (though it's for an older version). I would be happy to get one tenth of that. How can I make the new geocoder run faster? Thanks a lot!
Jian,
Depending on the way you are geocoding (Single Field vs Multiple Fields), geocoding speed can vary. Multiline should be much faster than what you are describing, so let's take a look at a couple of things you can do to improve the performance of the locator.
1. We can enable multithreading in the composite locator (off by default for SMP), but I wouldn't do this unless you have at least 4 cores available to use. You can enable this by opening the composite .loc file in a text editor and switching "UseMultithreading = false" to "UseMultithreading = true". If the property isn't there, just add it.
2. Another property that can help with performance is RuntimeMemoryLimit. This can also be found in the composite .loc file. Depending on the amount of RAM on your machine, I would suggest setting it to something around 2048 MB (it may already be set to that, and if it is, just leave it).
3. If you have a solid state drive, store the locator on that. This will really help the performance.
Brad
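For anyone who would rather script the two .loc edits above than make them by hand, here is a minimal sketch. It assumes the .loc file is plain text with one "Key = Value" property per line, which is how the properties are quoted in the post. The function names are made up for illustration, the sample text is invented, and the value written for RuntimeMemoryLimit is only a placeholder: copy the notation (units) from the line already in your own file. The file wrapper writes a .bak copy before changing anything.

```python
import re
import shutil
from pathlib import Path


def set_loc_properties(text, updates):
    """Return text with each 'Key = Value' line in updates replaced.

    Keys that are not present in the text are appended at the end.
    """
    seen = set()
    out = []
    for line in text.splitlines():
        match = re.match(r"^\s*([A-Za-z0-9_.]+)\s*=", line)
        key = match.group(1) if match else None
        if key in updates:
            out.append(f"{key} = {updates[key]}")
            seen.add(key)
        else:
            out.append(line)
    # "If the property isn't there, just add it."
    out.extend(f"{k} = {v}" for k, v in updates.items() if k not in seen)
    return "\n".join(out) + "\n"


def update_loc_file(path, updates):
    """Back up the .loc file next to itself, then rewrite it in place."""
    path = Path(path)
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(set_loc_properties(path.read_text(), updates))


# Invented sample content, not a real locator file.
sample = "SomeOtherProperty = 1\nUseMultithreading = false\n"
print(set_loc_properties(sample, {"UseMultithreading": "true"}))
```

Keeping the text transformation separate from the file handling makes it easy to try on a copy of the file contents first.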
Just to add more information --
The geocoder is a composite geocoder, stored in a folder (not SDE etc.).
Brad, thank you again for the suggestions! I've tested the two parameters as you suggested, and the performance IS much better now at around 87,000 records/hour. Thank you!
It is two times faster, but still much slower than the older version. I would like to get a comparable speed if possible. I also tested RuntimeMemoryLimit = 4096 MB out of curiosity, but it didn't help. What else could hinder the performance?
Jian,
A couple questions for you.
1. What are the specs of your computer?
2. Are you using Single Field geocoding or Multiple Fields geocoding?
3. Are you running off of a HDD or SSD?
Brad
1. OS is Windows Server 2008 R2 standard, SP1
CPU: Intel Xeon X5660 @2.80GHz
RAM: 8G
64bit
2. I use Multiple fields geocoding
3. It's a virtual machine, so I will have to ask our IT department to find out. Just in case, how much improvement might we see going from an HDD to an SSD?
Jian,
Everything looks good with the specs, and I am glad to see you are using Multiple Fields; it is much faster.
Virtual machines can have some issues depending on how the environment is set up. Physical hardware is always preferred because you know what you are really getting.
As for the performance difference between HDD and SSD, it can be significant when doing Desktop geocoding because nothing is hot in RAM. The initial overhead of reading the indexes from disk is a real bottleneck (which is why we use a cache to store recently searched features in RAM; that cache is what RuntimeMemoryLimit controls).
One more thing, if it is possible for you to do, is sorting the input table by state, then city, and then postal code. This will make the reads from the disk more efficient because consecutive records are read from the same physical area.
Hope this helps you some more.
Brad
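The sorting suggestion above could be sketched roughly as follows for an input table stored as a CSV file. The column names "State", "City" and "ZIP" are assumptions for illustration; substitute whatever your address table actually uses. Values are trimmed and upper-cased for comparison only, so the written rows are unchanged apart from their order.

```python
import csv


def sort_for_geocoding(rows, keys=("State", "City", "ZIP")):
    """Sort address records by state, then city, then postal code."""
    return sorted(
        rows,
        key=lambda row: tuple(row[k].strip().upper() for k in keys),
    )


def sort_csv(src, dst, keys=("State", "City", "ZIP")):
    """Read src, sort its rows, and write them to dst with the same header."""
    with open(src, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        fieldnames = reader.fieldnames
    with open(dst, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(sort_for_geocoding(rows, keys))


# Small in-memory example with made-up records.
records = [
    {"State": "NY", "City": "Albany", "ZIP": "12207"},
    {"State": "CA", "City": "Redlands", "ZIP": "92373"},
    {"State": "CA", "City": "Fresno", "ZIP": "93650"},
]
for row in sort_for_geocoding(records):
    print(row["State"], row["City"], row["ZIP"])
```

The sorted CSV would then be used as the input table for batch geocoding in place of the original.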
Brad,
To be clear, which indexes are you referring to by "The initial overhead reading the indexes from disk"? Is it loading the reference data in the .lox file into memory? I see ArcMap freeze for a while when I pick the locator, and I thought that was when the indexes are loaded.
Also, I noticed the speed was around 180,000 records/hour when the geocoder initially started working, but it dropped to 80,000 records/hour within a few seconds. I am not sure whether this indicates anything.
Thanks for the sorting suggestion. I will keep it for use against really huge data.
In general, how would you comment on the rate of 87,000 records/hour? Is it normal, or way below the benchmark you saw when testing the geocoder? If it's within the normal range, I think I will leave it there. Thanks a lot for the help.
Jian
Jian,
It is much lower than I would expect even for an HDD. On HDD I would expect to see something closer to 250,000.
Brad
Brad,
Just want to share a new discovery with you --
The culprit for the slow performance seems to be the change to the Min Match and Min Candidate scores. The 87,000 records/hour performance was measured after those two had been changed (e.g. to 93); when I use "clean" .loc and .xml files, the performance is back to about 200,000 records/hour. I also noticed that quite a few other configuration items were added to the .loc file when those two scores were changed through ArcCatalog.
Regards,
Jian
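To see exactly which configuration items were added when the scores were changed through ArcCatalog, one option is to compare the "clean" .loc file against the modified one. This is a rough sketch that assumes both files are plain text with "Key = Value" lines; the function names, property names and values in the example are made up for illustration and are not taken from a real locator.

```python
def parse_loc(text):
    """Collect 'Key = Value' lines into a dict; other lines are ignored."""
    props = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


def diff_loc(clean_text, modified_text):
    """Return (added, removed, changed) properties between two .loc texts."""
    clean = parse_loc(clean_text)
    modified = parse_loc(modified_text)
    added = {k: v for k, v in modified.items() if k not in clean}
    removed = {k: v for k, v in clean.items() if k not in modified}
    changed = {
        k: (clean[k], modified[k])
        for k in clean
        if k in modified and clean[k] != modified[k]
    }
    return added, removed, changed


# Invented contents standing in for the two files.
clean_text = "ExampleScore = 85\nUseMultithreading = true\n"
modified_text = "ExampleScore = 93\nUseMultithreading = true\nExtraProperty = 1\n"

added, removed, changed = diff_loc(clean_text, modified_text)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

With real files, the two texts would come from reading the clean and modified .loc files from disk. Reverting the added items one at a time might then help narrow down which one accounts for the slowdown.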