I have published a geocode service from a composite locator that was created via the Create Locator tool in ArcGIS Pro 2.6.1. The locator covers the entire state of New Jersey, so it includes nearly 4 million address points along with road centerlines. Because of that, maximizing performance has been a big issue. I have a table of addresses that I batch geocode in ArcGIS Pro as part of the performance testing. This table takes 4 minutes and 30 seconds to complete. I noticed, however, that when I leave the state out of my inputs, the table takes 56 seconds to complete. The state values are all NJ in the input data used to create the locator, and they are all NJ in the table I am batch geocoding. I am curious why there is such a big difference in performance and how I can ensure that our users get the faster performance. I could leave the state values out of my input data or provide guidance to users to not enter a state, but neither of these is ideal.
Just to clarify, you used the Create Locator tool to create two locators and added them to a composite locator, rather than using the Create Address Locator tool to build the locators in the composite?
Does either of these locators have any alternate name tables linked to it?
For better performance I would suggest creating a multirole locator with the Create Locator tool instead of multiple single locators added to a composite locator. I would also make use of the tips to improve performance described here, Tips for improving geocoding performance—ArcGIS Pro | Documentation .
Can you provide any details about the field mapping and any geocoding options set in the participating locators in the composite? Are there any IDs, such as a street ID, that could be used to link the points to the street centerlines?
Joe, it is possible, but not recommended if you are able to combine all of your data for a single role together. Better performance is achieved with a multirole locator and being able to minimize duplicate results.
Correct, I am creating two separate locators and adding them to a composite. Unfortunately, combining them into a multirole locator does not seem to work because, even though I enter the address points above the roads, that hierarchy does not persist in the results. So there could be a match in my address points that has a score of 91 and a match from my roads input that gets a score of 96. Ideally, I want any match from the address points over 85 to be returned even though its score is lower than the match from the roads. Perhaps I am missing something, but I only see a way to set score thresholds for all roles collectively rather than on individual roles to prevent this from happening.
Both of my locators have an alternate city name table and an alternate street name table. I have tried tweaking the settings provided in the performance documentation, and they don’t seem to speed up batch performance. There is an ID that links the address points to the street centerlines. How can I use that to my advantage?
Thank you so much for your help!
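For anyone reading along, the preference described above (take an address point match scoring over 85 even when a road match scores higher) could be applied as a post-processing step on the candidates instead of inside the locator. This is only a minimal sketch: the candidate dictionaries, the `pick_candidate` helper, and the sample match text are hypothetical and are not an ArcGIS API; only the role names and the 85 threshold come from the thread.

```python
from typing import Dict, List, Optional

POINT_ROLE = "PointAddress"   # role name as used in this thread
POINT_MIN_SCORE = 85          # threshold taken from the post above


def pick_candidate(candidates: List[Dict]) -> Optional[Dict]:
    """Prefer the best address point candidate scoring over the threshold;
    otherwise fall back to the highest-scoring candidate of any role."""
    if not candidates:
        return None
    points = [
        c for c in candidates
        if c["role"] == POINT_ROLE and c["score"] > POINT_MIN_SCORE
    ]
    pool = points or candidates
    return max(pool, key=lambda c: c["score"])


# Hypothetical candidates for one input address (scores from the example above).
candidates = [
    {"role": "StreetAddress", "score": 96, "match": "10 MAIN ST"},
    {"role": "PointAddress", "score": 91, "match": "10 MAIN ST"},
]
best = pick_candidate(candidates)
print(best["role"], best["score"])
```

With these inputs the address point candidate is chosen even though the road candidate scores higher. If no address point clears the threshold, the highest score overall is returned.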
What is the reasoning behind preferring a lower scored PointAddress? It is interesting that the PointAddress match scores lower than the StreetAddress match. Given the additional information you provided about the alternate city and street name tables, I believe that an issue with linking the alternate city table in the two locators may be the cause of the poor performance. Is the alternate city table formatted like the following or are there duplicate city names in the alternate name table? If there are duplicate records in the alternate city table, they can create additional, unneeded records in the locator, which slows the locator down because it has to search through a larger index to find the best match. This is multiplied across both locators, so the composite performs slowly as well as the individual locators.
As for the duplicate results and different scores, yes, it is because there is no postal code for the PointAddress record, so it is returned with a lower score. The best way to fix this is to have the same fields in both datasets. There are two options. The first is to enhance the PointAddress dataset to include the postal code (this can be done by overlaying postal polygons and applying each polygon's postal code to the points that fall within it). The second is to not map the postal code for the StreetAddress locator when building it. This is not ideal, but it would give better ordering of the results because the scores would be the same. I would stick to option 1 if that is possible.
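To illustrate option 1, here is a rough sketch of what the overlay does conceptually: each address point receives the postal code of the polygon that contains it. In ArcGIS Pro this would normally be done with a spatial join rather than hand-written code; the plain Python below uses made-up coordinates and placeholder postal codes, and the function and field names are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray cast to the right of (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_postal_codes(points, postal_polygons):
    """Return a copy of each point with a 'postal' key
    (None when no polygon contains the point)."""
    enriched = []
    for pt in points:
        code = next(
            (poly["postal"] for poly in postal_polygons
             if point_in_polygon(pt["x"], pt["y"], poly["ring"])),
            None,
        )
        enriched.append({**pt, "postal": code})
    return enriched


# Made-up postal polygons and address points, for illustration only.
postal_polygons = [
    {"postal": "00001", "ring": [(0, 0), (10, 0), (10, 10), (0, 10)]},
    {"postal": "00002", "ring": [(10, 0), (20, 0), (20, 10), (10, 10)]},
]
points = [
    {"id": 1, "x": 3, "y": 4},
    {"id": 2, "x": 15, "y": 5},
    {"id": 3, "x": 30, "y": 5},
]
enriched_points = assign_postal_codes(points, postal_polygons)
for pt in enriched_points:
    print(pt["id"], pt["postal"])
```

Points that fall outside every polygon keep an empty postal value and would need a manual check before the locator is rebuilt.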
As for joining the alternate city names, it is better to associate an ID with all cities that share the same alternate city name. The alternate city name table would then have a single record for that alternate city name, linked to all of the records that have that ID. This is more efficient and will make the locator smaller and faster.
Hi Brad, thank you for your response.
Yes, it sounds like enhancing our address points is the way to go. Also, I am having a hard time envisioning how I could slim down the alternate city name table as you mentioned in instances where an address point has multiple alternate city names. Would you be able to provide an example or diagram of this?
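One possible reading of the ID-based approach, sketched in plain Python rather than as locator configuration: the address points carry a city ID, and the alternate city name table holds one row per ID and alternate name, so a city with several alternate names gets several rows while the rows are no longer repeated for every address point. The field names (`address_id`, `city_id`, `alt_city`) and the city names are made up, and the sketch assumes the primary city name identifies a city uniquely, which real data may not guarantee.

```python
# Starting point: one alternate-name row per address point per alternate name.
per_address_alt_rows = [
    {"address_id": 101, "city": "Springfield", "alt_city": "Springfield Twp"},
    {"address_id": 101, "city": "Springfield", "alt_city": "Springfield Township"},
    {"address_id": 102, "city": "Springfield", "alt_city": "Springfield Twp"},
    {"address_id": 102, "city": "Springfield", "alt_city": "Springfield Township"},
    {"address_id": 103, "city": "Riverton", "alt_city": "Riverton Boro"},
]


def normalize(rows):
    """Split per-address rows into an address-to-city-ID lookup and a
    deduplicated alternate city name table keyed by city ID."""
    city_ids = {}          # primary city name -> city ID
    address_to_city = {}   # address ID -> city ID (stored on the address points)
    alt_table = []         # one row per (city ID, alternate name)
    seen = set()
    for row in rows:
        city_id = city_ids.setdefault(row["city"], len(city_ids) + 1)
        address_to_city[row["address_id"]] = city_id
        key = (city_id, row["alt_city"])
        if key not in seen:
            seen.add(key)
            alt_table.append({"city_id": city_id, "alt_city": row["alt_city"]})
    return address_to_city, alt_table


address_to_city, alt_table = normalize(per_address_alt_rows)
print(len(per_address_alt_rows), "rows reduced to", len(alt_table))
for row in alt_table:
    print(row)
```

In this toy data, both Springfield address points share city ID 1, and the alternate table keeps two rows for that ID (one per alternate name) instead of four.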