Extracting an integer from a string field

86494
10
12-22-2013 04:24 PM
PhilFreeman
New Contributor II
I have an attribute table with a column populated with road names.  The names are in the form, "State Hwy 13", "US Hwy 1", "County Rd 444", etc.  I have created another column in the table which I would like to populate with just the numerical portions of these road names.  I thought this would be easy with the field calculator, but I am finding that to not be the case.  I would like to use Python, but if VBScript would work better, that would be ok as well.  Please keep in mind that I know very little about Python and have used Visual Basic only in Excel macros, so if you post some code, do not leave anything out thinking I will know what is missing.  I am running ArcMap 10.2  Any help will be greatly appreciated.
10 Replies
DanPatterson_Retired
MVP Emeritus
>>> import string
>>> a = "US Hwy 1"
>>> b = string.split(a," ")[-1]
>>> b
'1'
>>>
0 Kudos
PhilFreeman
New Contributor II
Dan,  I appreciate the quick reply, but I guess I didn't state the problem clearly enough.  My examples made it appear that the string always ends with the number I am trying to extract, but that is only the case about 95% of the time.  So for example, if the string reads, "State Hwy 33 W", the script returns "W".  For "State Hwy 29 Bus", it returns "Bus".  Can this script be modified to eliminate those problems?
0 Kudos
XanderBakker
Esri Esteemed Contributor
Hi Phil,

The string.isdigit() can be used to test if it is numeric. Looping through the split list (from back) will result in the first number.

lstRoads = ['US Hwy 1', 'State Hwy 33 W','State Hwy 29 Bus', 'some 1 other string text with a number','yet another test with a 6 number']
for r in lstRoads:
    lstr = r.split(' ')
    for i in range(len(lstr)-1,-1,-1):
        if lstr.isdigit():
            print " - part {0} - in string '{1}' is '{2}' and this is numeric {3}".format(i,r,lstr,float(lstr))
            break


returns:
- part 2 - in string 'US Hwy 1' is '1' and this is numeric 1.0
- part 2 - in string 'State Hwy 33 W' is '33' and this is numeric 33.0
- part 2 - in string 'State Hwy 29 Bus' is '29' and this is numeric 29.0
- part 1 - in string 'some 1 other string text with a number' is '1' and this is numeric 1.0
- part 5 - in string 'yet another test with a 6 number' is '6' and this is numeric 6.0


Please note that if a part contains a number and a character this will not be seen as numeric...

Kind regards,

Xander
0 Kudos
PhilFreeman
New Contributor II
Thank you, Xander.  I am totally new to Python and am finding that it is not as intuitive as I had hoped.  But, I will see if I can apply what you sent to ArcMap.  Maybe I will get ambitious and take one of ESRI's online courses on Python. 

Best wishes.

Phil
0 Kudos
DouglasSands
Occasional Contributor II
Thank you, Xander.  I am totally new to Python and am finding that it is not as intuitive as I had hoped.  But, I will see if I can apply what you sent to ArcMap.  Maybe I will get ambitious and take one of ESRI's online courses on Python. 

Best wishes.

Phil


You could also use regex:


import re
text = 'There is some 234198372 data in here'
print re.findall('\\d+', text)


This would print:
['234198372']


The result will be a list of numeric strings. You could convert these to integers.

Also this post has the following code, which does more or less the same thing without regex:
>>> str = "h3110 23 cat 444.4 rabbit 11 2 dog"
>>> [int(s) for s in str.split() if s.isdigit()]
[23, 11, 2]


with the caveat that:

This will not recognise floats, negative integers, or integers in hexadecimal format.
KristenE
New Contributor III
1. Right click on your new field, to get to the field calculator
2. Click the Python radio button
3. Check the Show Codeblock box
4. In the Pre-Logic Script Code box, copy and paste this function:
def get_num_from_string(string):
    '''This function retrieves numbers from a string and converts them to integers'''
    # Create empty string to store numbers as a string
    num = ''
    # Loop through characters in the string
    for i in string:
        # If one of the characters is a number, add it to the empty string
        if i in '1234567890':
            num+=i
    # Convert the string of numbers to an integer
    integer = int(num)
    return integer

5. Then in the bottom box, type this (but replace ADDRESS, with the name of your address field):
get_num_from_string(!ADDRESS!)


Hint: Choose long integer as your datatype for the new field. Or if you want to keep it a string, make the function return "num" instead of "integer"
DouglasSands
Occasional Contributor II

def get_num_from_string(string):
    '''This function retrieves numbers from a string and converts them to integers'''
    # Create empty string to store numbers as a string
    num = ''
    # Loop through characters in the string
    for i in string:
        # If one of the characters is a number, add it to the empty string
        if i in '1234567890':
            num+=i
    # Convert the string of numbers to an integer
    integer = int(num)
    return integer



But this breaks down if the string looks like:

>>> text = 'Value  = -12.214'
>>> num = get_num_from_string(string)
>>> num
12214


Which flips the sign and multiples the number by n where n is the number of digits to the right of the decimal.

It will also fail for situations in addresses like '123 72nd St.' which would become 12372.
0 Kudos
DanPatterson_Retired
MVP Emeritus

from Phil's 2nd post... where a solution works 95% of the time.. perhaps find a solution for it and then determine whether the remaining 5% has unique characteristics that the first 95% doesn't share... Often a two stage solution is faster and more elegant than a one-pass run.  It would be interesting to see this type of dataset broken down on those lines.  I am sure that people dealing with geocoding all the time, can come up with examples where failure of the one-step solution occurs all the time.

MatsElfström
New Contributor II

This is of no help, but I'd like to mention that MapBasic/MapInfo has had the function Val(str), from the start. And it has been of great use to me. Today I miss it in ArcGIS. Incidently, not even FME or Excel seem to have a straightforward transformer or module to do this. Strange indeed.