Select to view content in your preferred language

Strip numbers from certain fields

1890
19
10-30-2018 11:30 AM
TonyAlmeida
MVP Regular Contributor

I need to strip numbers from certain fields. I have the following but it strips all number from the entire table.

Some of these fields have blanks and nulls.

import arcpy

arcpy.env.overwriteOutput = True

table = "C:/temp/EMS_Table"

field = ['Agency', 'fa', 'lz']
for field in [f.name for f in arcpy.Describe(table).fields if f.type == "String"]:  
    exp = "!{0}!.strip('0123456789.- ')"  
    arcpy.management.CalculateField(table, field, exp.format(field), "PYTHON")‍‍‍‍‍‍‍‍‍‍
0 Kudos
19 Replies
JoshuaBixby
MVP Esteemed Contributor

I grab suggestions from StackOverflow quite a bit, I try to include a link to the original SO thread to include in my GeoNet comment.

JoeBorgione
MVP Emeritus

see edit above....

That should just about do it....
DanPatterson_Retired
MVP Emeritus

from this old thread

https://community.esri.com/message/639336-re-how-to-split-variable-length-field-at-first-number-fiel... 

keep numbers

fld = "aaasdfasdfasdf12341asdfadfadsf"

"".join([i for i in fld if i.isdigit() or i == " "]).strip()
'12341'‍‍‍‍

keep text

"".join([i for i in fld if i.isalpha() or i == " "]).strip()

'aaasdfasdfasdfasdfadfadsf'

line 1 becomes your field calculation and you feed 'fld' into it

ie

for fld in fields:

      do stuff with the above in your code

TonyAlmeida
MVP Regular Contributor

After running everything looked fine but there seems to be some "#" symbols in those fields as well, For example Uk05#26. I need to remove those as well.

field = ['Agency', 'fa', 'lz']
for f in field:
      exp = "!{0}!.strip('0123456789.- ')"
      arcpy.management.CalculateField(table, f, exp.format(f), "PYTHON")

0 Kudos
JoeBorgione
MVP Emeritus

That's thing about naming conventions: they don't mean anything if people won't/don't follow them. Once you get your data cleaned up, then automation works really, really well. You can make yourself crazy trying to automate anomalies only to find another one.  And then another.

Sometimes you just have to roll up your sleeves and expect to get your hands dirty...

That should just about do it....
0 Kudos
TonyAlmeida
MVP Regular Contributor

I am receiving this data and in order for me to make it work for me i have to clean this table up but i get what you are saying.

0 Kudos
JoeBorgione
MVP Emeritus

I like summary tables to find the outliers. A pound sign or dash or some other weird character will jump out in a hurry, and typically the count will be fairly low...

That should just about do it....
0 Kudos
DanPatterson_Retired
MVP Emeritus

just add # to your strip( '0123456789.- #') list

0 Kudos
TonyAlmeida
MVP Regular Contributor

I did try that before i posted but for some reason it's only removing the # from two fields not all three. I just wasn't sure if i was missing something. Does it matter if there is spaces? for example "UK 05# 56CCO. So in this example it needs to be UKCCO.That's the only thing i am seeing that is different.

0 Kudos
DanPatterson_Retired
MVP Emeritus

if you look at your replace list, make sure there is a space in it

a = '0123456789.- #'
a0 = list(a)
a0
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '.', '-', ' ', '#']  
Notice the character before # is a space '  '
0 Kudos