I'm using the usaddress library to split single-line addresses into their various components. Below is the functioning script that I'm using for testing:
import arcpy
from arcpy import env
import os
import usaddress
from collections import OrderedDict
# Get input table through parameter
table = arcpy.GetParameterAsText(0)
# Get the field containing the single-line addresses through parameter
infield = arcpy.GetParameterAsText(1)
# List of fields to be updated
fields = ["addr", "addrNum", "stNm", "zip"]
with arcpy.da.UpdateCursor(table, fields) as cursor:
    for addr, addrNum, stNm, zip in cursor:
        try:
            parse = usaddress.tag(addr)[0]
            addrNum = parse.get("AddressNumber", "")
            stNm = parse.get("StreetName", "")
            zip = parse.get("ZipCode", "")
            cursor.updateRow([addr, addrNum, stNm, zip])
            continue
        except Exception:
            arcpy.AddMessage("Error")
Here's what the testing input table looks like before:
And after:
My goal is to write the records that error out to a separate table. (Errors include situations where multiple areas of an address have the same label, like "5318 S 86 CT APT 3 APT 412, Omaha, NE 68137" - multiple APT entries, or "PO Box 291 PO BOX 291, Genoa, NE 68640" - multiple PO Box entries.) I've reviewed the examples in the error handling page, but they seem to be very focused on geoprocessing.
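To make the "same label in multiple areas" condition concrete, here is a rough sketch of how such repeats could be detected in a list of (token, label) pairs. It is only an illustration of the idea, not usaddress's own implementation: `repeated_labels` is a hypothetical helper, and the labeled pairs are hand-written for the example rather than real parser output.

```python
from itertools import groupby


def repeated_labels(pairs):
    """Return labels that occur in more than one separate run.

    `pairs` is a list of (token, label) tuples. Adjacent tokens that
    share a label are treated as a single run.
    """
    # Collapse adjacent identical labels into one entry per run.
    runs = [label for label, _ in groupby(pairs, key=lambda p: p[1])]
    seen, repeated = set(), []
    for label in runs:
        if label in seen and label not in repeated:
            repeated.append(label)
        seen.add(label)
    return repeated


# Hand-written labels for illustration only (assumed, not parser output).
ok = [("123", "AddressNumber"), ("Main", "StreetName"),
      ("St", "StreetNamePostType")]
bad = [("APT", "OccupancyType"), ("3", "OccupancyIdentifier"),
       ("APT", "OccupancyType"), ("412", "OccupancyIdentifier")]

print(repeated_labels(ok))
print(repeated_labels(bad))
```

An address like the "APT 3 APT 412" one would fall into the second case, because each occupancy label shows up in two separate places.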
Also, this is the first time I've used cursors in this manner, which prevents me from using something like the following:
        except Exception:
            arcpy.AddMessage(str(row.getValue("OBJECTID")) + " Error")
At least I can't figure it out so far. Any suggestions or bump in the right direction are greatly appreciated.
Thanks
Justin
When the exception occurs, the cursor has already populated addr, addrNum, stNm, zip. So you already have the address information you can print out or log. If you want, you could print the OIDs out, since they are the unique identifiers, something like:
# List of fields to be updated
fields = ["addr", "addrNum", "stNm", "zip", "OID@"]
with arcpy.da.UpdateCursor(table, fields) as cursor:
    for addr, addrNum, stNm, zip, oid in cursor:
        try:
            parse = usaddress.tag(addr)[0]
            addrNum = parse.get("AddressNumber", "")
            stNm = parse.get("StreetName", "")
            zip = parse.get("ZipCode", "")
            cursor.updateRow([addr, addrNum, stNm, zip, oid])
        except Exception:
            arcpy.AddMessage("Error: {}".format(oid))
One quick question, why the continue statement at the bottom of the try block?
So you have written some code already, which is good. Does the code generate an error itself? Or does the code give unexpected results? If an error, please post it with the traceback. If unexpected results, what did you expect vs. what are you getting?
I thought I needed the 'continue' in there or it would 'stop' at the first error, but I now realize that's the point of try/except.
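For anyone else wondering about the same thing, a minimal sketch of that behaviour follows. It uses a hypothetical `parse_address` stand-in instead of `usaddress.tag` so that it runs without arcpy or any third-party library; the point is only that the loop carries on by itself after the except block.

```python
def parse_address(text):
    """Hypothetical stand-in for a parser: fails on blank input."""
    if not text.strip():
        raise ValueError("cannot parse empty address")
    return text.upper()


def process(addresses):
    parsed, failed = [], []
    for text in addresses:
        try:
            parsed.append(parse_address(text))
        except ValueError:
            # No 'continue' needed: once the except block finishes,
            # the for loop moves on to the next item by itself.
            failed.append(text)
    return parsed, failed


parsed, failed = process(["1 main st", "", "2 oak ave"])
print(parsed)
print(failed)
```

The second record fails, is collected in `failed`, and the third record is still processed.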
There are currently no errors generated; the script simply returns a text message of "Error" if an address record isn't parsed. I'd like to get an output of the actual record that cannot be parsed by usaddress.tag.
So with that 'continue' removed I now have the following:
import arcpy
from arcpy import env
import os
import usaddress
from collections import OrderedDict
# Get input table through parameter
table = arcpy.GetParameterAsText(0)
# Get the field containing the single-line addresses through parameter
infield = arcpy.GetParameterAsText(1)
# List of fields to be updated
fields = ["addr", "addrNum", "stNm", "zip"]
with arcpy.da.UpdateCursor(table, fields) as cursor:
    for addr, addrNum, stNm, zip in cursor:
        try:
            parse = usaddress.tag(addr)[0]
            addrNum = parse.get("AddressNumber", "")
            stNm = parse.get("StreetName", "")
            zip = parse.get("ZipCode", "")
            cursor.updateRow([addr, addrNum, stNm, zip])
        except Exception:
            arcpy.AddMessage("Error")
Which returns a message of "Error" for each of the two incorrect address records:
Thanks Josh - that makes sense and works to return the Object ID of the records in question, which is a great start.
I'm now going to turn my attention to working on getting those records out into another table.
Thanks again
Justin
I spent most of the day on this and didn't get too far, but I have something. I wasn't able to write to a table in the same .gdb as the source address table, but I am writing to a table in a different .gdb. This isn't a huge deal. However, using the code below, as soon as the first exception is found, it stops parsing the addresses and populating fields in the source table ('addresses2'), and then populates all remaining records in the errors table ('errors'). So once it finds an exception it doesn't go back to the try.
import arcpy
from arcpy import env
import os
import usaddress
from collections import OrderedDict
arcpy.env.overwriteOutput = True
# Get input table
table = arcpy.GetParameterAsText(0)
# Get the field containing the single-line addresses
infield = arcpy.GetParameterAsText(1)
# Get the output table for errors
outErrors = arcpy.GetParameterAsText(2)
# List of fields
fields = ["addr", "addrNum", "stNm", "zip", "OID@"]
with arcpy.da.UpdateCursor(table, fields) as cursor:
    for addr, addrNum, stNm, zip, oid in cursor:
        try:
            parse = usaddress.tag(addr)[0]
            addrNum = parse.get("AddressNumber", "")
            stNm = parse.get("StreetName", "")
            zip = parse.get("ZipCode", "")
            cursor.updateRow([addr, addrNum, stNm, zip, oid])
        except Exception:
            cursor = arcpy.da.InsertCursor(outErrors, fields)
            for row in (str(oid)):
                cursor.insertRow([addr, addrNum, stNm, zip, oid])
            arcpy.AddMessage("Error with record: {}".format(oid))
            arcpy.AddMessage("Bad Input Address = : {}".format(addr))
Below you can see how record 310 in the addresses2 table is the first exception, which is written to the errors table, but all subsequent records are written as exceptions as well.
I feel I'm close, I'm just not sure where to go from here.
Thanks
Justin
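A likely explanation for the behaviour described above: the for loop keeps iterating over the original update cursor, but once the name `cursor` is rebound to an insert cursor inside the except block, every later `cursor.updateRow(...)` call is made on an object that has no such method. That raises an AttributeError, which the broad `except Exception` catches, so every remaining record lands in the errors table. The sketch below reproduces the pattern with two hypothetical stand-in classes (`FakeUpdateCursor` and `FakeInsertCursor`, not arcpy classes) so it can run anywhere.

```python
class FakeUpdateCursor:
    """Hypothetical stand-in for an update cursor; not arcpy."""

    def __init__(self, rows):
        self.rows = rows
        self.updated = []

    def __iter__(self):
        return iter(self.rows)

    def updateRow(self, row):
        self.updated.append(row)


class FakeInsertCursor:
    """Hypothetical stand-in for an insert cursor; has no updateRow."""

    def __init__(self):
        self.inserted = []

    def insertRow(self, row):
        self.inserted.append(row)


def run(rows, bad_value):
    source = FakeUpdateCursor(rows)
    errors = FakeInsertCursor()
    cursor = source
    for value in cursor:  # the iterator is taken from `source` once
        try:
            if value == bad_value:
                raise ValueError("unparseable")
            # Raises AttributeError once `cursor` has been rebound.
            cursor.updateRow(value)
        except Exception:
            cursor = errors  # rebinding the name, as in the script above
            cursor.insertRow(value)
    return source.updated, errors.inserted


updated, inserted = run([1, 2, 3, 4], bad_value=2)
print(updated)
print(inserted)
```

Only the first row is updated; the bad row and everything after it end up in `inserted`. Using a different name for the insert cursor avoids the problem, and catching a narrower exception type would have surfaced the AttributeError instead of hiding it.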
Okay, I did recognize that I can't use the same cursor, so I now start a new cursor ('cursor1') within the exception. The errors table is now populated only with actual error records, but there are multiple entries for each error.
import arcpy
from arcpy import env
import os
import usaddress
from collections import OrderedDict
arcpy.env.overwriteOutput = True
# Get input table
table = arcpy.GetParameterAsText(0)
# Get the field containing the single-line addresses
infield = arcpy.GetParameterAsText(1)
# Get the output table for errors
outErrors = arcpy.GetParameterAsText(2)
# List of fields
fields = ["addr", "addrNum", "stNm", "zip", "OID@"]
with arcpy.da.UpdateCursor(table, fields) as cursor:
    for addr, addrNum, stNm, zip, oid in cursor:
        try:
            parse = usaddress.tag(addr)[0]
            addrNum = parse.get("AddressNumber", "")
            stNm = parse.get("StreetName", "")
            zip = parse.get("ZipCode", "")
            cursor.updateRow([addr, addrNum, stNm, zip, oid])
        except Exception:
            cursor1 = arcpy.da.InsertCursor(outErrors, fields)
            for row in (str(oid)):
                cursor1.insertRow([addr, addrNum, stNm, zip, oid])
            arcpy.AddMessage("Error with record: {}".format(oid))
            arcpy.AddMessage("Bad Input Address = : {}".format(addr))
Need to figure out why it's giving multiple returns now...
Justin
In the exception block of your code you are inserting the row once for each digit in the oid:
oid = 310  # object id of address causing error
for row in (str(oid)):
    print(row)  # this is your insert line
# prints
# 3
# 1
# 0
## try this
        except Exception:
            cursor1 = arcpy.da.InsertCursor(outErrors, fields)
            cursor1.insertRow([addr, addrNum, stNm, zip, oid])
            # since this is an insert, oid is probably ignored
            # if you want to save the old oid, you will need to use
            # a field other than OID@ in the fields list for this value
            arcpy.AddMessage("Error with record: {}".format(oid))
            arcpy.AddMessage("Bad Input Address = : {}".format(addr))
Thanks Randy - that did the trick, and it makes sense. Thank you for the explanation.
Justin
Justin, do you feel comfortable giving a single person credit for the "Correct" answer or do you want this just marked Assumed Answered?