Data Interop Ext - Parsing a String Name

07-30-2010 05:35 AM
NelsonDe_Miranda
Frequent Contributor
We are building a series of spatial ETL tools to help us clean up some of the data we have been receiving. I have run into a problem where each road segment name takes the following form:

"1500 W Saint George St"
"1400 W Saint George St"
"1300 W Saint George St"

Using the string searcher I am able to identify all the names that begin with a numeric character using ^[0-9]. The problem is that when the matched names (those starting with a numeric value) are returned, I am unable to retain the latter portion of the string, from the end of the numeric value onward.

In addition, the numbers are not always in the same format. For example, some street names are listed like this:

"-1 George St."
"500-600 George St"
"Nanaimo Ave"

My idea is to combine a series of string searchers to ensure that I capture all the names that begin with symbols or numbers, and then use the space following that leading part to separate the name out.

Unfortunately, I have been unsuccessful in doing so.

Thanks in advance,

Nelson
2 Replies
BruceHarold
Esri Regular Contributor
Hi Nelson

Welcome to the arcane world of regular expressions!
You are going to need to build a more complex regular expression to pick up the address components. For example, this pattern parses the case "500-600 George St":

([0-9]+)(-*)([0-9]*) ([a-zA-Z ]+)

You will then need to grab the parts from the resulting _matched_parts list:

`_matched_parts{0}' has value `500'
`_matched_parts{1}' has value `-'
`_matched_parts{2}' has value `600'
`_matched_parts{3}' has value `George St'
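
For anyone who wants to test the idea outside the workbench, here is a minimal Python sketch of the same capture-group approach. It is only an illustration and says nothing about how the transformer works internally. RANGE_PATTERN follows the pattern above but uses [A-Za-z ]+ for the name part so it does not rely on case-insensitive matching. PREFIX_PATTERN and split_name are hypothetical additions, not from this thread, that treat any leading run of digits and hyphens followed by a space as the number and keep the rest as the name. Note that Python numbers capture groups from 1, while the listing above starts at index 0.

```python
import re

# Same shape as the pattern above: number, optional hyphen(s),
# optional second number, a space, then the street name.
RANGE_PATTERN = re.compile(r"([0-9]+)(-*)([0-9]*) ([A-Za-z ]+)")

# Hypothetical broader pattern (an assumption, not from the thread):
# an optional run of digits/hyphens followed by whitespace, then the name.
PREFIX_PATTERN = re.compile(r"^\s*(?:([-0-9]+)\s+)?(.*\S)\s*$")


def split_name(raw):
    """Return (numeric_prefix, street_name); prefix is '' when absent."""
    match = PREFIX_PATTERN.match(raw)
    if match is None:
        # Empty or whitespace-only input: nothing to split.
        return "", raw.strip()
    return match.group(1) or "", match.group(2)


if __name__ == "__main__":
    print(RANGE_PATTERN.match("500-600 George St").groups())
    samples = [
        "1500 W Saint George St",
        "-1 George St.",
        "500-600 George St",
        "Nanaimo Ave",
    ]
    for name in samples:
        print(split_name(name))
```

Requiring whitespace after the numeric part means a name such as "12th Ave" is left whole instead of being split after "12". Whether that is the right behaviour depends on your data.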

Regards
NelsonDe_Miranda
Frequent Contributor
Perfect!

I tried some regular expressions and couldn't get them to return what I wanted. Now I see what I was doing wrong.

Thanks!

- Nelson