What is the best way to handle geocoding a weekly list that is 90% the same?

01-29-2024 09:29 AM
jzcgis
by
New Contributor III

Sorry, the title may not convey well what I am looking for. I am not sure whether Python is the best way to do this (if not, please let me know). Currently I have a custom script that does most of this, but we are moving all of our production to one of the enterprise servers, so things need to change. We have a list of employees that includes their addresses (keep in mind that people move, so that is an issue). We get this list weekly. In an effort not to use all my credits, I created a local locator. It seems like a simple workflow (get list, geocode, upload to enterprise), but it is not straightforward at all. The list usually has only minor changes, mostly new hires and people who have moved. Here is basically what I do:

1- Geocode with the local locator.

2- Clean up the USER_ prefixes (annoying results from geocoding) and delete mystery fields that show up.

3- Select the unmatched records and export them.

4- Delete the fields that were added by geocoding.

5- Delete the unmatched records from the original so I can append them back once matched.

6- Geocode the unmatched records with the Esri geocoder.

7- Append the newly Esri-geocoded matches to the records geocoded with the locator.
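
Steps 2 and 3 could be sketched in plain Python roughly as follows. This is only an illustration under a few assumptions: the geocoded rows have already been read into dictionaries (for example with a search cursor), the input fields come back prefixed with USER_, and the output carries a Status field where "U" marks unmatched records. The helper names `strip_user_prefix` and `split_by_status` are made up for this sketch.

```python
def strip_user_prefix(row):
    """Return a copy of row with the 'USER_' prefix removed from field names."""
    prefix = "USER_"
    return {
        (name[len(prefix):] if name.startswith(prefix) else name): value
        for name, value in row.items()
    }


def split_by_status(rows):
    """Split geocoded rows into (matched, unmatched) using the Status field.

    Rows with Status 'U' are unmatched. Everything else ('M' matched,
    'T' tied) is treated as matched here, which is an assumption.
    """
    matched, unmatched = [], []
    for row in rows:
        (unmatched if row.get("Status") == "U" else matched).append(row)
    return matched, unmatched


# Made-up sample rows, for demonstration only.
rows = [
    {"Status": "M", "USER_Name": "A", "USER_Address": "1 Main St"},
    {"Status": "U", "USER_Name": "B", "USER_Address": "99 Nowhere Rd"},
]
matched, unmatched = split_by_status([strip_user_prefix(r) for r in rows])
```

The unmatched list is what would go on to the Esri geocoder in step 6.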

Separate script 1: send the newly geocoded addresses to the locator so they are kept for future geocoding.

Separate script 2:

1- Delete the feature layer items with the following:

flayers = feature_layer_item.layers
flayer = flayers[0]
max_objid = flayer.query(out_statistics=[{"statisticType":"MAX","onStatisticField":"objectid","outStatisticFieldName":"MAX_OBJ"}], return_geometry=False)
maxoid = max_objid.features[0].attributes['MAX_OBJ']

i = 0
step = 20000

while i <= maxoid:
    i += step
    flayer.delete_features(where=f"objectid <= {i}")

 

This method was the fastest method to delete the features. Everything else took forever.

2- Use append to add the geocoded features to the empty layer. The list is normally about 90k features, and the append takes 1-2 hours. Is that normal?

My main question, in addition to those above, is whether there is a better way to handle this, either with Python or with something else. I am assuming Python is the answer since it involves geocoding and uploading to enterprise, but I am not sure whether there are other options with Esri.

9 Replies
AllenDailey1
Occasional Contributor III

I would be interested in hearing any suggestions for this task, too! My organization has a script that does basically the same workflow (but with only one geocoder), plus publishing a map service that's used in an app. I inherited the script from a past employee, and it's in Python 2. So now I have to rewrite it to work in Python 3 and ArcGIS Pro rather than ArcMap. The script is hundreds of lines, and I already know that simple one-to-one substitutions of Pro functions for ArcMap functions are not possible in this particular workflow. So I'd love to hear anything people have to share on this topic.

jzcgis
by
New Contributor III

Honestly, my workflow would be cut in half if Esri allowed multiple geocoders to be used in one geocoding run, with the option to use your local one first and then send anything not matched to the Esri geocoder.
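
In script form, that local-first, Esri-second order might look something like the sketch below. It is not tied to any particular locator: `primary` and `fallback` are hypothetical callables standing in for the local locator and the Esri geocoder, and the sample coordinates are made up.

```python
def geocode_with_fallback(addresses, primary, fallback):
    """Geocode with `primary`; send only its misses to `fallback`.

    `primary` and `fallback` are callables that take an address string and
    return an (x, y) tuple, or None when there is no match.
    Returns (results, still_unmatched).
    """
    results = {}
    still_unmatched = []
    for address in addresses:
        location = primary(address)
        if location is None:
            # Only addresses the primary geocoder missed reach the fallback.
            location = fallback(address)
        if location is None:
            still_unmatched.append(address)
        else:
            results[address] = location
    return results, still_unmatched


# Stand-in geocoders for demonstration only.
local_index = {"1 Main St": (-86.16, 39.77)}
paid_index = {"2 Oak Ave": (-86.10, 39.80)}
calls_to_paid = []


def local_geocoder(address):
    return local_index.get(address)


def paid_geocoder(address):
    calls_to_paid.append(address)
    return paid_index.get(address)


results, missed = geocode_with_fallback(
    ["1 Main St", "2 Oak Ave", "3 Elm Rd"], local_geocoder, paid_geocoder
)
```

In this toy run the paid geocoder is only called for the two addresses the local one could not match.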

Tom_Laue
New Contributor III

@jzcgis wrote:

Honestly, my workflow would be cut in half if Esri allowed multiple geocoders to be used in one geocoding run, with the option to use your local one first and then send anything not matched to the Esri geocoder.


I have been using this methodology.

I first try one of the county geolocators in our service territory.
Then, if no results are returned, I fall back to the Esri geolocator.

 

 

from arcgis.geocoding import geocode  # needs an active GIS connection, e.g. gis = GIS(...)

# Cities in the service territory, tried one by one when the address as given does not match.
citiesInCEG = ['Greenwood','Westfield','Zionsville','Mooresville','Crows Nest','Meridian Hills','Rocky Ripple','Southport','Speedway','Indianapolis','Clermont','Linton','Avon','Brownsburg','Plainfield','Beech Grove','Fishers','Carmel']


def bestCandidate(features, allowedTypes):
    """Return the highest-scoring feature whose Addr_type is in allowedTypes, or None."""
    best = None
    for feature in features:
        attributes = feature.attributes
        if attributes["Addr_type"] not in allowedTypes:
            continue
        if best is None or attributes["Score"] > best.attributes["Score"]:
            best = feature
    return best


def bestCandidateInOtherCities(addressString, allowedTypes):
    """Retry the street part of the address with each city name; return the best match, or None."""
    streetPart = addressString.split(",")[0]
    best = None
    for cityName in citiesInCEG:
        newCityAddress = streetPart + ", " + cityName + ", Indiana, USA"
        new_geocode_result = geocode(address=newCityAddress, as_featureset=True)
        candidate = bestCandidate(new_geocode_result.features, allowedTypes)
        if candidate is not None and (best is None or candidate.attributes["Score"] > best.attributes["Score"]):
            best = candidate
    return best


def ESRIGeolocator(addressString):
    """Return "latitude+longitude" for the best match scoring above 93, otherwise None."""
    geocode_result = geocode(address=addressString, as_featureset=True)

    # 1. street address / point address match check on the address as given
    best = bestCandidate(geocode_result.features, ("StreetAddress", "PointAddress"))
    # 2. no address match: retry with the other city names, point addresses only
    if best is None:
        best = bestCandidateInOtherCities(addressString, ("PointAddress",))
    # 3. street name match check against the original result
    if best is None:
        best = bestCandidate(geocode_result.features, ("StreetName",))
    # 4. last resort: street-level matches with the other city names
    if best is None:
        best = bestCandidateInOtherCities(addressString, ("StreetAddress", "StreetName"))

    if best is None or best.attributes["Score"] <= 93:
        return None

    # x is the longitude and y is the latitude
    longitude = best.geometry["x"]
    latitude = best.geometry["y"]
    latlong = str(latitude) + "+" + str(longitude)
    print(best.attributes["Match_addr"])
    print("\t" + str(best.attributes["Score"]) + "% geocode match")
    print("\tMatchtype: " + best.attributes["Addr_type"])
    print(latitude, longitude)
    return latlong








import json
import ssl
import urllib.parse
import urllib.request

FullAddress = "500 S Capitol Ave"
city = "Indianapolis"

# Build the query string with urlencode so spaces and punctuation are escaped correctly.
params = urllib.parse.urlencode({"Street": FullAddress, "City": city, "outSR": 4326, "f": "json"})
url = "https://xmaps.indy.gov/arcgis/rest/services/Locators/IndyStreets/GeocodeServer/findAddressCandidates?" + params

# Certificate verification is switched off for this request.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

html = urllib.request.urlopen(url, context=ctx).read()

# The response is already valid JSON, so it can be parsed directly.
jsonResult = json.loads(html.decode())

#print(jsonResult)

if jsonResult['candidates']:
    print("result found")

    candidate = jsonResult['candidates'][0]

    # x is the longitude and y is the latitude
    longitude = candidate['location']['x']
    latitude = candidate['location']['y']
    score = candidate['score']
    print(candidate['address'])

    print("\t"+str(score)+"% geocode match")
    print(latitude,longitude)

    # The full script stores this value in latlong_dict[WO]; both names are defined elsewhere.
    latlong = str(latitude)+"+"+str(longitude)

else:
    # No candidate from the county locator: fall back to the Esri geolocator.
    addressString = FullAddress + ", " + city + ", IN"
    print(addressString)
    latlong = ESRIGeolocator(addressString)

 

 

jzcgis
by
New Contributor III

Thank you for the code!

BobBooth1
Esri Contributor

If the weekly list is 90% the same, maybe it is worth doing a filter step before geocoding, checking the address string in the new list against the existing feature address. If they're the same, keep the feature; if not, remove the feature, geocode the new address, and append the result back into your feature class.
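
A rough sketch of that filter step in plain Python, assuming each record has a stable identifier (an employee ID is assumed here; the thread does not say which key the list has) and that the existing features and the new list have both been read into dictionaries. The function name `plan_weekly_update` is made up for illustration.

```python
def plan_weekly_update(existing, incoming):
    """Compare last week's addresses with this week's list.

    Both arguments map an employee ID to an address string.
    Returns (keep, to_geocode, to_remove) as sorted lists of IDs:
      keep       - same ID and same address, the feature is left alone
      to_geocode - new ID, or an ID whose address changed
      to_remove  - address changed, or ID no longer in the list
    """
    keep, to_geocode, to_remove = [], [], []
    for emp_id, address in incoming.items():
        if emp_id not in existing:
            to_geocode.append(emp_id)
        elif existing[emp_id] == address:
            keep.append(emp_id)
        else:
            to_geocode.append(emp_id)
            to_remove.append(emp_id)
    # Anyone missing from the new list is removed as well.
    to_remove.extend(e for e in existing if e not in incoming)
    return sorted(keep), sorted(to_geocode), sorted(to_remove)


# Made-up sample data, for demonstration only.
existing = {101: "1 Main St", 102: "2 Oak Ave", 103: "3 Elm Rd"}
incoming = {101: "1 Main St", 102: "20 Pine St", 104: "4 Birch Ln"}
keep, to_geocode, to_remove = plan_weekly_update(existing, incoming)
```

Only the IDs in `to_geocode` would need to go through the locator, and only those in `to_remove` would need to be deleted from the feature class.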

jzcgis
by
New Contributor III

I like this idea, thank you.

jzcgis
by
New Contributor III

Just to double-check: when you say filter, do you mean in the tool sense (https://pro.arcgis.com/en/pro-app/latest/arcpy/classes/filter.htm), or do you mean checking in code which addresses match those already in the locator and filtering out those that do not?

BobBooth1
Esri Contributor

I meant add an analytical step to compare the address values of the new features to the old features, and only geocode the points that have changed addresses. 
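
When comparing address values, small formatting differences (case, punctuation, extra spaces) can make an unchanged address look changed. One possible way to guard against that is to compare a normalized key rather than the raw string. The sketch below is an illustration only; `address_key` is a hypothetical helper and the sample records are made up.

```python
import re


def address_key(address):
    """Build a comparison key: uppercase, punctuation removed, single spaces."""
    cleaned = re.sub(r"[^\w\s]", " ", address.upper())
    return " ".join(cleaned.split())


# Made-up sample data: E1 differs only in formatting, E2 really moved.
old = {"E1": "500 S. Capitol Ave, Indianapolis", "E2": "12 Oak St"}
new = {"E1": "500 s capitol ave  indianapolis", "E2": "14 Oak St"}

changed = [k for k in new if address_key(new[k]) != address_key(old.get(k, ""))]
```

Here only E2 ends up in `changed`, so only that record would be sent to the geocoder.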

jzcgis
by
New Contributor III

Thank you, that makes sense!
