Address filed split into fields

2113
7
07-28-2017 08:59 AM
CCWeedcontrol
Occasional Contributor III

I have a table with about 6k road names. only the FULLSTNAME field is populated. I need to populate the PERDIR (pre directional) STNAME (Street name only), STTYPE (Street type) and POSTDIR (post directional) from the field FULLSTNAME.

To start off i started with field calculator but only a handful of fields get populated.

def myParser(inString):
  splitString = inString.split(' ')       ###splits the address into a list
  a = splitString[0]                         ### list item 0 is the first part of name          
  b = splitString[1]                         ### list item 1 is the prefix                
  c = ' '.join(splitString[2:-1])         ###  this rerurns the street name
  d = splitString[-1]                       ### this returns the suf dir or street type            
  return a                                     ### depending on what you want to return,
  #return b                                   ### un-comment and re-comment out the
  #return c                                   ### appropriate value(s)
  #return d

myParser(!FULLSTNAME!)

Even looking at just populating "PREDIR" field with the above filed calculator i would run into problems, for example the road name "New Castle Dr" it would put "New" in the PREDIR field which correct because it is not a Pre- directional. Some roads have two words for names like "New Castle Dr" has "New Castle". if used return c it would only populate the STNAME field with just "New".

I tried looking at using arpy with the following but get an invalid syntax line 15

import arcpy

from arcpy import env
inputTable = "Road_Names_1"
arcpy.AddMessage ("input " + inputTable)

#split field loop
rows = arcpy.UpdateCursor (inputTable)  #gets all vlaues for table

for row in rows:  #loops through vlaues and updates new field 
    row.Type = row.FULLSTNAME.split(" ")[-1] if ',' in !FULLSTNAME! then !STNAME!=!FULLSTNAME!.split(',',1)[-1] else ""
    rows.updateRow(row)

del row
del rows

I also tried to look at at a post by Ted Kowal of his Address Parase but i am not sure how to use it.

If anyone already has a python script that works and would be willing to share i would really appreciate it.

Thanks.

0 Kudos
7 Replies
JoshuaBixby
MVP Esteemed Contributor

Just say "no" to rolling your own address parser.  There is no way you can make something that will work better than some of the Python address parsers that already exist.  There are several available, just do an internet search.  Personally, I use GitHub - datamade/usaddress: a python library for parsing unstructured address strings into address ...  quite a bit.

CCWeedcontrol
Occasional Contributor III

Joshua,

thank you for the replay and link to the address parse. So an Address parse breaks up a single address strings into individual parts, which i get but i am not sure how use it to accomplish what i am trying to do.

0 Kudos
JoshuaBixby
MVP Esteemed Contributor

Did you read the docs for usaddress, try something, and it didn't work?  What part don't you understand, the parsing of the addressing, updating/populating the fields using a cursor?

Using usaddress and a couple of addresses from your screenshot:

>>> import usaddress
>>> for k,v in usaddress.tag('N Windsor Park Ct')[0].items():
...     print k, v
...     
StreetNamePreDirectional N
StreetName Windsor Park
StreetNamePostType Ct
>>> for k,v in usaddress.tag('Noahs Landing Cir')[0].items():
...     print k, v
...     
StreetName Noahs Landing
StreetNamePostType Cir
>>> 
CCWeedcontrol
Occasional Contributor III

thanks for the reply yes i did read the docs my apologies for not understand how to use the parser .

I am not quit sure how updating/populating the fields using a cursor.

This is what i have came up with  but i am not sure if i am using usaddress correctly. I am getting an error on

row[1] = addr_dict[row[4]]
TypeError: tuple indices must be integers, not unicode

import arcpy, usaddress
from collections import OrderedDict



# Get feature class through parameter
Table = "RoadNameTabl"

field1 = "PREDIR"
field2 = "STNAME"
field3 = "STTYPE"
field4 = "POSTDIR"
field5 = "FULLSTNAME"
'''
fields = ['PREDIR','STNAME','STTYPE','POSTDIR','FULLSTNAME']


fields = (field1,field2,field3,field4,field5)
with arcpy.da.UpdateCursor(Table,fields) as cursor:
    for row in cursor:
        field2 = row[fields.index(field5)]
'''

with arcpy.da.UpdateCursor(Table,(field1, field2, field3, field4, field5)) as cursor:
    for row in cursor:
        addr_dict = usaddress.tag(row[4])

        # say you want to put the address name in field 2
        row[1] = addr_dict[row[4]]
        cursor.updateRow(row)

del row

print 'Done'
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍
0 Kudos
JoshuaBixby
MVP Esteemed Contributor

You just need to spend some time exploring the data structure of the output from usaddress.tag.  Tagging returns a tuple with an ordered dict and a string classifying the address.

Using your code example, I think the following is what you are after:

import arcpy, usaddress
from collections import OrderedDict



# Get feature class through parameter
Table = "RoadNameTabl"

fields = ['PREDIR','STNAME','STTYPE','POSTDIR','FULLSTNAME']

with arcpy.da.UpdateCursor(Table,fields) as cursor:
    for predir, stname, sttype, postdir, fullstname in cursor:
        parse = usaddress.tag(fullstname)[0]
        predir = parse.get("StreetNamePreDirectional", "")
        stname = parse.get("StreetName", "")
        sttype = parse.get("StreetNamePostType", "")
        postdir = parse.get("StreetNamePostDirectional", "")
        cur.updateRow([predir, stname, sttype, postdir, fullstname])
CCWeedcontrol
Occasional Contributor III

ok, i am see it now. Thank you very much for your help.

Although i am getting an error and i am not sure why. Is it not liking the actual values( Ave, Rd, Ct, Dr, St, Way, Ln, Blvd, Pl,Cir, Trl) and if so how can tell which one it doesn't like?

line 17, in <module>

cursor.updateRow([predir, stname, sttype, postdir, fullstname])
RuntimeError: Invalid column value [STTYPE]

import arcpy, usaddress
from collections import OrderedDict

# Get feature class through parameter
Table = "RoadNameTabl"

fields = ['PREDIR','STNAME','STTYPE','POSTDIR','FULLSTNAME']

try:
with arcpy.da.UpdateCursor(Table,fields) as cursor:
    for predir, stname, sttype, postdir, fullstname in cursor:
        parse = usaddress.tag(fullstname)[0]
        predir = parse.get("StreetNamePreDirectional", "")
        stname = parse.get("StreetName", "")
        sttype = parse.get("StreetNamePostType", "")
        postdir = parse.get("StreetNamePostDirectional", "")
        cursor.updateRow([predir, stname, sttype, postdir, fullstname])

print 'Done'
0 Kudos
TedKowal
Occasional Contributor III

When I wrote those parsing routines, there were no freely available parsers in existence.  Secondly, I was not necessarily interested in parsing the addresses but to "cleanup" some very bad data I received from our tax authority database.  Lastly, I cannot claim credit for all the routines -- they are a collection of snippets from lots of python folks...  I took a look at the usaddress parser.  Sorta funny,  I will be getting a 5 Million record dump from the same tax authority shortly and I intend to use the parser suggested by Joshua Bixby this time around.

Addresses are very messy as you are finding out ... best to use something that has already been developed!  (And is free!).  Here in Miami, I have found out that the US Postal system aka address does not always use the official street names and nomenclature so I had to interject a lot of exceptions to clean up the data for my use to relate the addresses to our inventory of roads.

BTW: I am attaching a cal file which I used during the cleanup.  The cal file imports the script you mentioned and returns the StreetNumber in this example.  But in my case I am going the opposite direction you are... I have a bunch of addresses for which I am trying to find a related Roads Inventory street for.....

0 Kudos