Extracting numbers from a string using Python in ArcGIS Pro

05-31-2019 07:32 AM
KhaledAli1
New Contributor II

I have a string field in the attribute table of a feature class with values like "abc123_def23_987" and "fks243_ufs876_21". I've created another integer field and want to extract the number after the last underscore (_), so that the examples above yield 987 and 21 respectively. I'm using ArcGIS Pro. How can I accomplish this using Python in "Calculate Field"? I would greatly appreciate any help with this. Thanks.


Accepted Solutions
JoshuaBixby
MVP Esteemed Contributor
>>> s = "abc123_def23_987"
>>> int(s.split("_")[-1])
987
>>>
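
To use the same idea in the Calculate Field tool, the split can be wrapped in a small function for the Code Block. This is only a minimal sketch: the function name last_number is a placeholder, and returning None for empty or non-numeric values is an assumption about how such rows should be handled, not something stated in the thread.

```python
def last_number(value):
    """Return the integer after the last underscore, or None if it is not a number."""
    if not value:
        # Assumption: empty or missing values should stay empty
        return None
    # rsplit with maxsplit=1 cuts only at the last underscore
    tail = value.rsplit("_", 1)[-1]
    try:
        return int(tail)
    except ValueError:
        # Assumption: non-numeric tails are skipped rather than raising an error
        return None


print(last_number("abc123_def23_987"))  # 987
print(last_number("fks243_ufs876_21"))  # 21
```

With the function definition in the Code Block, the Expression would be last_number(!your_field!), where your_field stands in for the actual string field name.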

6 Replies
KhaledAli1
New Contributor II

Thank you so much for your help. It worked perfectly. I definitely need to learn Python. Thanks.

GFEP
New Contributor

In the Calculate Field tool, you can use Regular Expressions.

Parser:

Python

Code Block:

import re

def only_num(numbers):
    # Return the last run of consecutive digits in the string (as text)
    return re.findall(r'\d+', numbers)[-1]

Expression:

only_num(!your_field!)

JoshuaBixby
MVP Esteemed Contributor

I like regular expressions, a lot, but I also realize they aren't ideal for simple situations like the one the OP presented here. Using split relies only on built-in functionality, i.e., there is no need to import additional modules, and split is likely 3-4 times faster than a regular expression in this case, so I think split is the better approach.
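
The speed difference can be checked on any machine with timeit. The sketch below is a rough comparison on the OP's sample value; the function names and the iteration count are arbitrary choices, and the actual ratio will depend on the Python version and the data.

```python
import re
import timeit

s = "abc123_def23_987"


def with_split(value):
    # Built-in string method: take the piece after the last underscore
    return int(value.split("_")[-1])


def with_regex(value):
    # Regular expression: take the last run of digits
    return int(re.findall(r"\d+", value)[-1])


# Both approaches agree on the sample value
assert with_split(s) == with_regex(s) == 987

split_time = timeit.timeit(lambda: with_split(s), number=100_000)
regex_time = timeit.timeit(lambda: with_regex(s), number=100_000)
print(f"split: {split_time:.3f} s, regex: {regex_time:.3f} s")
```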

GFEP
New Contributor

Hello Joshua.

But if you try to extract only the numbers from a string that doesn't have any "splitting character", your suggestion doesn't work. So regular expressions can be a very powerful option. I ran this code today on a geodatabase with 10,000 polygons. It took 1 second to calculate.
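
A short illustration of that case, using a made-up value that contains no underscore (the string below is hypothetical and not from the OP's data):

```python
import re

s = "abc123def987"  # hypothetical value with no underscore

# split finds no "_" and returns the whole string as the only piece
pieces = s.split("_")
print(pieces)  # ['abc123def987']

try:
    int(pieces[-1])
except ValueError as exc:
    print("split approach fails:", exc)

# The regex still picks out the last run of digits
last = re.findall(r"\d+", s)[-1]
print(last)  # 987
```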

JoshuaBixby
MVP Esteemed Contributor

There has to be some kind of splitting character or characters, otherwise there would be no way to know what differentiates one group of numbers in a string from another.  In the OP's example and data set, the strings happened to be very structured, with a convenient splitting character right before the final group of numbers, which made using str.split simple.  In your example, relying on the decimal digit regex special sequence (\d) implies the splitting character is any character that is not a decimal digit, which presents its own challenges if the numbers include commas or periods.
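
To make the comma and period issue concrete, here is a small sketch with a made-up value (the string is hypothetical, not from the OP's data):

```python
import re

s = "lot_1,234.5"  # hypothetical value whose number contains a comma and a period

# \d+ treats every non-digit, including "," and ".", as a separator
digit_runs = re.findall(r"\d+", s)
print(digit_runs)  # ['1', '234', '5']

# Splitting on the known delimiter keeps the formatted number in one piece
tail = s.rsplit("_", 1)[-1]
print(tail)  # 1,234.5
```

Either way, the formatted number would still need further handling before it could be stored in a numeric field.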

As I mentioned above, I like and use regex all the time, but it can also be overkill when the situation can be handled with basic string methods.
