
A better way to parse an address?

08-12-2015 08:38 AM
JoeBorgione
MVP Emeritus

I looked at the Standardize Addresses tool to see if it would do what I need. It works for most of my address data, but a number of records don't fit the perfect model. (See Error Trapping.)

At any rate, holding the rank of Hack Specialist .1 in the Python Legion, I'm running a series of scripts to parse out address components from a single string address in the form of

1234 S Main ST or 1234 E Olive Branch Dr or 1234 S 300 E

The house number, pre-dir and suf-type/suf-dir aren't too bad, but teasing out the street name itself is a little more challenging, as it may be multiple words. What I've come up with is a series of splits and joins that gets the job done, but there has got to be a better way. Any pointers are appreciated.

Here is what I do:

def myStreetName(inString):
  a = inString.split(' ')
  b = a.pop()            # takes off suf
  c = ' '.join(a)        # put it back together
  d = c.split(' ')       # split it back out again
  e = d.pop(0)           # takes off housenum
  f = ' '.join(d)        # put it back together again
  g = f.split(' ')       # split again
  h = g.pop(0)           # get rid of pre-dir
  street = ' '.join(g)   # leaves just the street
  return street          # home free

myStreetName(!fullAddress!)
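
For comparison, here is a minimal sketch of the same extraction done with a single regular expression. It assumes every address follows the house number / one-letter pre-dir / street / suffix layout of the three samples above. The pattern, the group names and the `street_name` helper are illustrative only, and anything that does not fit the layout comes back as None.

```python
import re

# Assumed layout: house number, one-letter pre-direction, street name, final token.
ADDRESS_RE = re.compile(
    r"^\s*(?P<house_num>\d+)\s+"
    r"(?P<pre_dir>[NSEW])\s+"
    r"(?P<street>.+?)\s+"
    r"(?P<suffix>\S+)\s*$",
    re.IGNORECASE,
)


def street_name(address):
    """Return the street-name portion, or None if the address does not fit the layout."""
    match = ADDRESS_RE.match(address)
    return match.group("street") if match else None


for sample in ("1234 S Main ST", "1234 E Olive Branch Dr", "1234 S 300 E"):
    print(sample, "->", street_name(sample))
```

The named groups also expose the other pieces (`house_num`, `pre_dir`, `suffix`) from the same match, so one pattern could feed all four fields.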

That should just about do it....
13 Replies
JoeBorgione
MVP Emeritus

Right in Darren's post above. This is how I run it in the attribute calculator:

def myParser(inString):
  splitString = inString.split(' ')   # splits the address into a list
  a = splitString[0]                  # list item 0 is the house number
  b = splitString[1]                  # list item 1 is the prefix
  c = ' '.join(splitString[2:-1])     # this returns the street name
  d = splitString[-1]                 # this returns the suf dir or street type
  return a                            # depending on what you want to return,
  #return b                           # un-comment and re-comment out the
  #return c                           # appropriate value(s)
  #return d

It'll look something like this where I am calculating the value of HouseNumber using the field PT_ADD as the input:
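
If editing the return line for each field gets tedious, one possible variation is to pass the wanted component in as a second argument. This is only a sketch: the `parse_component` name and the component keys are made up for illustration, and it assumes the same four-part layout as myParser.

```python
def parse_component(address, part):
    """Return one named component of a '<number> <pre-dir> <street...> <suffix>' address."""
    tokens = address.split()  # split() with no argument also collapses repeated spaces
    if len(tokens) < 4:
        return None  # too short to hold all four components
    components = {
        "house_number": tokens[0],
        "pre_dir": tokens[1],
        "street": " ".join(tokens[2:-1]),
        "suffix": tokens[-1],
    }
    return components[part]


# Hypothetical calculator expression for the HouseNumber field:
#   parse_component(!PT_ADD!, "house_number")
print(parse_component("1234 E Olive Branch Dr", "street"))
```

The length check means short or odd addresses return None instead of raising an IndexError partway through a calculation.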

MichaelHilstrom
New Contributor

OK, thanks, I will try it in the field calculator. I am a bit inexperienced. I have used the field calculator before, but that was over 10 years ago.

I really do appreciate the help

Thanks, Mike

JoshuaBixby
MVP Esteemed Contributor

Like all other tools, the best address parser is the one that someone else already wrote. Lately, I have been using usaddress and have been quite happy with it. Granted, it doesn't validate addresses, but it also doesn't carry all the bulky overhead of other geocoding services when all I want to do is parse addresses.
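
A rough sketch of what calling usaddress can look like, assuming the package is installed (`pip install usaddress`). The `tag_address` wrapper is a hypothetical helper, not part of the library. It returns None when the package is missing or when the parser finds the same label twice and raises `RepeatedLabelError`.

```python
def tag_address(address):
    """Label address parts with usaddress; return None if it is unavailable or unsure."""
    try:
        import usaddress  # third-party; assumed installed via `pip install usaddress`
    except ImportError:
        return None
    try:
        # tag() returns a mapping of label -> text plus a guessed address type
        tagged, address_type = usaddress.tag(address)
    except usaddress.RepeatedLabelError:
        return None
    return dict(tagged)


print(tag_address("1234 E Olive Branch Dr"))
```

The labels come from a statistical model, so the output is worth spot-checking against your own data rather than assuming a fixed result for any given address.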

TedKowal
Regular Contributor II

There is a lot of good stuff listed above. In the course of my work I occasionally have to parse addresses, and I get address data dumps from our tax division which are horribly bad! So I have been slowly working on a parser to accommodate this data, for what it is worth. (I am still learning Python.)
