bokeefe

Field Calculator : Strip "STRING" from within "STRING" (or how to spend hours doing something simple but in a difficult way)

Blog Post created by bokeefe on Mar 18, 2015

EDIT/UPDATE:

So apparently I've been TOO close to this whole project from the start. I completely forgot about the function Replace().

 

Public Function Replace(

  ByVal Expression As String,

  ByVal Find As String,

  ByVal Replacement As String,

  Optional ByVal Start As Integer = 1,

  Optional ByVal Count As Integer = -1,

  Optional ByVal Compare As CompareMethod = CompareMethod.Binary

) As String

 

I could have just used the following: NewEntry = Replace( [Address], "1/2", "" )

 

So now for my next trick, I will create a 3Gb script for capitalizing your name.


 

I am working with addresses. If you've done this yet, then you probably feel my pain. Everyone and their brother has an opinion on an address. Regardless of whether that address has been OFFICIALLY ASSIGNED by the ADDRESS COORDINATOR for that municipality or not.

 

I know this because I was, previous to my current GIS Analyst position, the "Address Coordinator for the City of Tulsa, Oklahoma."

 

And it didn't matter WHAT I said, someone else ALWAYS thought they knew the better answer. The funny thing was that sometimes there were MULTIPLE correct answers.

 

For instance, if we are dealing with an address of 1234-1/2 South Main Street East there are a variety of ways you can USE this address.

 

If there had previously been a building located at 1234 and one at 1234-1/2 but the main one had been demolished, then the post office might not care if you used 1234 or 1234-1/2. If the utility company you are on the phone ONLY had an entry for the 1234 and somehow 1234-1/2 had been set up on the 1234 old account, then they might NEVER care that officially 1234-1/2 is correct. Addresses can be... I hate to admit it... somewhat subjective.

 

Skip to today. I'm trying to clean up 200k+ addresses for a mass import into our Local Government Model in our Enterprise GIS system. I have hit a situation where 5k-10k address have the "1/2" sub-designator on them. We no longer USE this sub-designator. We have changed over to using a sub-designator of "REAR" much like "APT" is used. So "1234-1/2 S MAIN ST E" would now read "1234 S MAIN ST E, REAR" instead.

 

Now, some of these addresses might technically have not undergone the transformation that was supposed to happen roughly 4 years ago when I was still the "Address Coordinator" and they may legitimately be using an incorrect "1/2" but for all intents and purposes, I'm leaving that problem to the CURRENT "Address Coordinator" and trying to get the "1/2" out of ALL of our addresses.

 

What this has led to is a problem. I was able to use the toolbox tool ArcToolbox > Geocoding Tools > Standardize Addresses and break the various components out as needed. But these pesky "1/2" are still littering up my addresses. So I shifted over to the Field Calculator. I started hitting problems, but eventually settled on a solution that has worked wonders for me, and might be usable in other situations.

 

I had roughly 10 addresses where the "1/2" designator was at the END of the address. I manually removed those. But I'm still dealing with 5k-10k addresses containing "1/2" where there should be only silence. So... after some to-do, I figured out a solution. I wanted to share my code.

 

If you just need to see the code, here it is...

DIM StripStart, StripEnd
DIM DataToStrip, DataToStripRight, DataToStripLeft, DataToStripLen
DIM WhatToStrip, WhatToStripLen
DIM NewDataSet

WhatToStrip = "1/2"
WhatToStripLen = Len( WhatToStrip )

DataToStrip = [Address]
DataToStripLen = Len( DataToStrip )

StripStart = InStr( 1, DataToStrip, WhatToStrip )
If StripStart = 0 Then
     NewDataSet = DataToStrip
     Else
     StripEnd = StripStart + WhatToStripLen
     DataToStripLeft = Left( DataToStrip, ( StripStart - 1 ) )
     DataToStripRight = Right( DataToStrip, ( DataToStripLen - StripEnd ) )
     NewDataSet = DataToStripLeft + DataToStripRight
End If

There you go.

 

However, I recognize that not everyone understands Visual Basic, much less Visual Basic in relation to ArcGIS. So here is a COMMENTED OUT version of the same code for those who like to learn as they use.

DIM StripStart, StripEnd
DIM DataToStrip, DataToStripRight, DataToStripLeft, DataToStripLen
DIM WhatToStrip, WhatToStripLen
DIM NewDataSet

' Field Calculator VB Script Snippet
' Purpose: To strip targeted text from a text field
' Sample "DataToStrip" example = [1003 1/2 N EVANSTON AVE]
' Sample "WhatToStrip" example = [1/2]

' First we assign the STRING that we want to strip from our target
WhatToStrip = "1/2"

' We need to get the LENgth of this string, which in this case = 3
WhatToStripLen = Len( WhatToStrip )

' Our next step is to assign the targeted STRING data that we
' want to strip our "WhatToStrip" from. In this instance we
' are targeting the [Address] field.
DataToStrip = [Address]

' We need to get the LENgth of the "DataToStrip", which in
' this example = 23
DataToStripLen = Len( DataToStrip )

' Now we need to find the point at which to BEGIN our targeted
' removal. We do this by finding where in the "DataToStrip"
' using the VB Command InStr( ) to find where "WhatToStrip" is
' located. InStr() command returns the "number of characters from
' the beginning of the string where our targeted "WhatToStrip"
' can be found.

' Breakdown / Explanation:
' InStr( 1, DataToStrip, WhatToStrip )
' For our example above we are passing the command InStr three
' separate variables.
' 1) We are passing it the #1
' 2) We are passing it OUR variable called "DataToStrip" which
'    is the address [1003 1/2 N EVANSTON AVE]
' 3) We are passing it OUR variable called "WhatToStrip" which
'    is the STRING containing "1/2"
' What this does is tell InStr to find at what point the
' STRING we have assigned to "WhatToStrip" can be found inside
' of the STRING we have assigned "DataToStrip" if we start
' counting at the "1" character of "DataToStrip".
' So in our example, we are then counting from the number "1"
' in [1003 1/2 N EVANSTON AVE] to the first time that "1/2"
' appears which is the 6th character.
' So to sum it all up... if we starting counting at the first
' character in [1003 1/2 N EVANSTON AVE] until we hit the first
' occurence of the STRING "1/2" then we have counted to "6"
' So, we are then assigning the number 6 to our variable called
' "StripStart" so we can use it in our code.
StripStart = InStr( 1, DataToStrip, WhatToStrip )

' Now, since we may need to use this code sometimes in instances
' where we don't KNOW if our "WhatToStrip" actually EXISTS inside
' of the "DataToStrip" we need to do a check. We do this with
' an IF / THEN statement.
' The following IF / THEN statement says...

' If StripStart = 0
' Or, in english... if our previous command InStr didn't find
' the "WhatToStrip" inside of our "DataToStrip" then "StripStart"
' would now equal ZERO or 0
If StripStart = 0 Then

           ' IF "StripStart" equals ZERO / 0 then we are done. We
     ' don't need to do anything else with this data. Our code
     ' is finished...
     NewDataSet = DataToStrip
     Else

           ' Else
     ' Or, in english... If "StripStart" DOESN'T equal ZERO / 0
     ' Then we can move ahead.

     ' So we have deduced that our "DataToStrip" does IN FACT
     ' contain the "WhatToStrip" so we need to get some more
     ' information in order to complete our block of code.

     ' We need to take "StripStart" (which is now 6 in our example)
     ' and find the point inside of "DataToStrip" that the STRING
     ' we assigned to "WhatToStrip" ends. So we simply take
     ' "StripStart" and add the LENgth of our "WhatToStrip" to get
     ' our "StripEnd". This means that we would take 6 and add it
     ' to 3 and get 9. Or in english, we now know that our
     ' "WhatToStrip" exists inside of our "DataToStrip" and it
     ' starts at character "StripStart" (or 6) and it ends at
     ' character "StripEnd" (or 9).
     StripEnd = StripStart + WhatToStripLen

     ' It's time to take apart our "DataToStrip" and break it into
     ' two separate pieces.

           ' One step is to take ALL of the STRING to the LEFT of
     ' our "WhatToStrip"... We do this with the Left( ) command.

     ' This command takes two variables.
     ' Left ( DataToStrip, ( StripStart - 1 ) )
     ' 1) "DataToStrip" tells the Left() command that we are stripping
     '    the LEFT side of whatever is assigned to this variable.
     ' 2) ( StripStart - 1 ) tells the Left() command to take
     '    "StripStart" and subtract 1 to get the number of
     '    characters to strip from the LEFT side.
     ' Or in english, we need to take "DataToStrip" which has been
     ' assigned the STRING [1003 1/2 N EVANSTON AVE] and CUT OUT
     ' the first 5 characters ("StripStart" - 1 = 5) and assign
     ' them to our variable "DataToStripLeft".

     ' So once this line is executed, in our example, "DataToStripLeft"
     ' will now contain [1003 ]. Notice the space after the 1003.
     ' Spaces are considered a character. It's also important to note
     ' that the space is the 5th character if you count from the
     ' number 1, in 1003, and count TO THE RIGHT... or from the Left().
     DataToStripLeft = Left( DataToStrip, ( StripStart - 1 ) )

     ' Our next step is to take ALL of the STRING to the RIGHT of
     ' our "WhatToStrip"... we do this with the Right( ) command.

     ' This command is just like Left( ) and takes two variables.
     ' Right ( DataToStrip, ( DataToStripLen - StripEnd ) )
     ' 1) "DataToStrip" tells the Right() command that we are stripping
     '    the RIGHT side of whatever is assigned to this variable.
     ' 2) ( DataToStripLen - StripEnd ) tells the Right() command to
     '    take "DataToStripLen" and subtract the value assigned
     '    to "StripEnd" to get the number of characters to strip from
     '    the RIGHT side.
     ' Or in english, we need to take "DataToStrip" which has been
     ' assigned the STRING [1003 1/2 N EVANSTON AVE] and CUT OUT
     ' the last 14 characters ("DataToStripLen" - "StripEnd" or 23 - 9 = 14)
     ' and assign them to our variable "DataToStripRight".

     ' So once this line is executed, in our example, "DataToStripRight"
     ' will now contain [N EVANSTON AVE]. It's important to note that
     ' the letter "N" in this instance is the 14th character if you count
     ' from the "E" in "AVE" and count TO THE LEFT... or from the Right().
     DataToStripRight = Right( DataToStrip, ( DataToStripLen - StripEnd ) )

     ' So now we have "DataToStripLeft" assigned the STRING [1003 ] and
     ' we have "DataToStripRight" assigned the STRING [N EVANSTON AVE].
     ' With a simple concantenation we can create our final STRING minus
     ' the "WhatToStrip" in one fell swoop.

     ' We take the variable "NewDataSet" and assigned it two strings
     ' connected together.
     ' "DataToStripLeft" = [1003 ]
     ' "DataToStripRight" = [N EVANSTON AVE]
     ' If we piece them together, we get [1003 N EVANSTON AVE] and our
     ' pesky "1/2" is completely gone.
     NewDataSet = DataToStripLeft + DataToStripRight
End If

So enjoy, comment, correct or improve as you see fit. The biggest problem for this piece of code is if the 1/2 shows up at the end of the address. In MY situation, this only happened for roughly 10-20 so I manually corrected these. This code could be adjusted to cover those scenarios tho.

 

Well, that's all I have for this situation. I hope this helps someone.

 

Peace, Love and Basically VIsual

Outcomes