Hello,
I am trying to extract the largest number from a string.
More details:
I have a feature class with a text field. Each feature has a string like this: "3, 5, 3.9, 2.5 FAR Values" and I need to extract the highest number and put it in a new field. From that string I would need the number 5.
There are some features that have Null values and some with just text and no numbers.
I wrote the following script using a Python function from the internet, but I am not sure how to apply it in ArcPy.
import arcpy
# find largest number in a string
arcpy.env.workspace = r"D:\APRX_MXDS\USA_App_Project\usa_parcels_with_FARField.gdb"
arcpy.env.overwriteOutput = True
fc = "temp"
with arcpy.da.UpdateCursor(fc, "FAR_INTEGER") as cursor: # Loop through each feature
    for row in cursor:
        ls = list()
        for w in row[0].split():
            try:
                ls.append(int(w))
            except:
                pass
        try:
            return max(ls)
        except:
            return None
For extracting numbers from text, you will want to use regular expressions instead of Python's string split, unless your text strings are highly structured and simple. I would err on the side of using the re module (see "re: Regular expression operations" in the Python 3.8.3 documentation).
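To see why a plain split falls short on this data, here is a minimal sketch using the sample string from the question. The variable names and the pattern are only illustrative choices, not the only way to write it.

```python
import re

text = "3, 5, 3.9, 2.5 FAR Values"

# Splitting on whitespace leaves the commas attached to the tokens,
# so int() or float() fails on most of them.
tokens = text.split()
print(tokens)   # ['3,', '5,', '3.9,', '2.5', 'FAR', 'Values']

# A regular expression pulls out only the numeric parts.
# (Illustrative pattern: digits with an optional decimal part, or ".digits".)
numbers = re.findall(r'\d+\.?\d*|\.\d+', text)
print(numbers)  # ['3', '5', '3.9', '2.5']
```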
Assuming you created a new field "MAX_VALUE" to hold the maximum value, the following code should work for you:
import arcpy
import re
# find largest number in a string
arcpy.env.workspace = r"D:\APRX_MXDS\USA_App_Project\usa_parcels_with_FARField.gdb"
arcpy.env.overwriteOutput = True
fc = "temp"
with arcpy.da.UpdateCursor(fc, ["FAR_INTEGER", "MAX_VALUE"]) as cursor: # Loop through each feature
    for row in cursor:
        if row[0] is None: continue
        nbrs = [float(i) for i in re.findall('(\d\.?\d*|\.\d*)', row[0])]
        if nbrs:
            row[1] = max(nbrs)
            cursor.updateRow(row)
UPDATE: Made a change to the code to account for strings that don't have any numbers.
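The guard matters because max() raises a ValueError when the list is empty. A small sketch of that case, assuming a made-up field value with no digits and a simplified pattern:

```python
import re

text = "FAR Values"  # hypothetical field value with no digits in it
nbrs = [float(i) for i in re.findall(r'\d+\.?\d*', text)]
print(nbrs)  # []

try:
    max(nbrs)
except ValueError as err:
    # max() refuses an empty sequence
    print("max() failed:", err)

# Either test the list first ("if nbrs:") or pass a default to max().
largest = max(nbrs, default=None)
print(largest)  # None
```

The default argument to max() needs Python 3.4 or later, so this assumes the Python 3 that ships with ArcGIS Pro.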
Try testing your regex on:
"3, 5, 3.9, 12.5 FAR Values"
"3, 5, 3.9, 2.5 FAR Values."
This is a good case for using regex and a much better solution than the one I provided, but using regex does not mean you shouldn't check for exceptions in type conversions:
import arcpy
import re
# find largest number in a string
arcpy.env.workspace = r"D:\APRX_MXDS\USA_App_Project\usa_parcels_with_FARField.gdb"
arcpy.env.overwriteOutput = True
fc = "temp"
with arcpy.da.UpdateCursor(fc, ["FAR_INTEGER", "MAX_VALUE"]) as cursor: # Loop through each feature
    for row in cursor:
        if row[0] is None: continue
        nbrs = []
        for i in re.findall(r'(\d+\.?\d*|\d*\.\d+)', row[0]):
            try:
                nbrs.append(float(i))
            except ValueError: # skip any match that is not a valid number
                pass
        if nbrs:
            row[1] = max(nbrs)
            cursor.updateRow(row) # if FAR_INTEGER contained no numbers, no need to update
That's a slightly improved regex, but I still wouldn't trust it to reject every erroneous input. You could argue for wrapping the whole list assignment in a single try statement instead, but then if the regex matches one erroneous input mixed in with some valid ones, you would get no max value at all.
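For anyone following along, here is a rough sketch that runs both patterns from this thread on the two test strings and then converts the matches one token at a time. The names OLD, NEW and to_floats are made up for the illustration.

```python
import re

OLD = r'(\d\.?\d*|\.\d*)'       # first pattern posted in this thread
NEW = r'(\d+\.?\d*|\d*\.\d+)'   # revised pattern

samples = ["3, 5, 3.9, 12.5 FAR Values", "3, 5, 3.9, 2.5 FAR Values."]
for s in samples:
    print(re.findall(OLD, s), re.findall(NEW, s))
# OLD splits 12.5 into '12' and '.5' and also matches the lone trailing '.';
# NEW returns '12.5' whole and ignores the trailing '.'.

def to_floats(tokens):
    """Convert tokens one at a time so a bad token is skipped, not fatal."""
    values = []
    for t in tokens:
        try:
            values.append(float(t))
        except ValueError:
            pass  # e.g. '.' cannot be converted
    return values

bad_mix = re.findall(OLD, samples[1])   # ['3', '5', '3.9', '2.5', '.']
print(max(to_floats(bad_mix)))          # 5.0
```

With a single try around the whole conversion, the '.' token in bad_mix would have thrown away the four valid numbers as well.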
First, there are a number of problems with your code:
Resulting in something like:
import arcpy
# find largest number in a string
arcpy.env.workspace = r"D:\APRX_MXDS\USA_App_Project\usa_parcels_with_FARField.gdb"
arcpy.env.overwriteOutput = True
fc = "temp"
with arcpy.da.UpdateCursor(fc, ["FAR_INTEGER", "max_far"]) as cursor:
    for row in cursor:
        if row[0] is None: # skip Null values
            continue
        ls = []
        for w in row[0].replace(',', '').split():
            try:
                ls.append(float(w))
            except ValueError: # not a number, e.g. "FAR"
                pass
        if ls: # only update rows that contained a number
            row[1] = max(ls)
            cursor.updateRow(row)
I didn't test that, so I might have introduced more bugs.
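One way to check the parsing without touching the feature class is to pull it into a plain function and try it on a few strings first. This is only a sketch; largest_number is a made-up name, and it mirrors the replace/split approach above.

```python
def largest_number(text):
    """Return the largest number in a comma/space separated string, or None."""
    if text is None:              # Null field values
        return None
    values = []
    for w in text.replace(',', '').split():
        try:
            values.append(float(w))
        except ValueError:
            pass                  # skip words such as "FAR" or "Values"
    return max(values) if values else None

print(largest_number("3, 5, 3.9, 2.5 FAR Values"))  # 5.0
print(largest_number("FAR Values"))                 # None
print(largest_number(None))                         # None
```

Inside the cursor loop you could then set row[1] = largest_number(row[0]) before calling cursor.updateRow(row).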
Now, executing that script is a different topic. What program are you using? I'm most familiar with ArcGIS Pro, in which you should be able to create a new Jupyter notebook (Insert Tab -> New Notebook) and copy-paste it in.
Thank you so much for your replies. I will test all of these and let you know how it goes.