Remove letters and special characters from an alpha numeric string using python

11-15-2012 03:01 PM
JessicaKirby
Deactivated User
Hi All,
I need to find a way to remove all letters and special characters from a string, so that all I am left with is numbers, using Python.
The end goal is to use this code in the Python code block of the Calculate Field GP tool.

I have a field called Lease_Num.
It contains strings such as ML-26588, ML 25899, UTLM58778A.
So sometimes it has a dash, sometimes it has a space, and sometimes it has nothing. The alpha characters can be any length and the numbers can be any length. Sometimes there can even be an alpha character at the end of the string... it's totally random.

I need to strip out the alpha characters and any special characters or spaces from the string, leaving me with just the number.

Can someone help me with how I would do this and then tell me how to apply it to the code block on Calculate Field GP tool.

Many thanks!

Accepted Solutions
MikeHunter
Frequent Contributor
A simple 1 liner will do it:
cleanedValue =  ''.join([i for i in fieldValue if i.isdigit()])
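
As a quick check, the same expression can be run against the sample lease numbers from the question. This is only a sketch; the helper name `digits_only` is hypothetical and introduced here for illustration.

```python
def digits_only(field_value):
    """Keep only the characters for which str.isdigit() is true."""
    return ''.join(ch for ch in field_value if ch.isdigit())

# Sample values taken from the original question
samples = ["ML-26588", "ML 25899", "UTLM58778A"]
cleaned = [digits_only(s) for s in samples]
print(cleaned)  # ['26588', '25899', '58778']
```

Note that the result is still a string, so it would need converting (for example with `int()`) before being written to a numeric field.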



good luck,
Mike


11 Replies
ChrisSnyder
Honored Contributor
Something like this would work:

import string
fieldValue = "ML-26588"
# Characters to strip: letters, punctuation and whitespace (e.g. the space in "ML 25899")
unwanted = string.ascii_letters + string.punctuation + string.whitespace
for character in fieldValue:
    if character in unwanted:
        fieldValue = fieldValue.replace(character, "")
print(fieldValue)


I think it would be better to use something like this in an update cursor, though. Something like:

#Removes all letters and special characters from a string - hopefully leaving only numbers.
import string, arcpy
# myFC is the path to your feature class; MY_FIELD is the text field to clean
unwanted = string.ascii_letters + string.punctuation + string.whitespace
updateRows = arcpy.UpdateCursor(myFC)
for updateRow in updateRows:
   fieldValue = updateRow.MY_FIELD
   for character in fieldValue:
      if character in unwanted:
         fieldValue = fieldValue.replace(character, "")
   updateRow.MY_FIELD = fieldValue  # write the cleaned value back before updating the row
   updateRows.updateRow(updateRow)
del updateRow, updateRows
Luke_Pinner
MVP Regular Contributor
There's always more than one way to do something 🙂

Here's another
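import string
fieldValue = "ML-26588-12-a"
# Python 2 only: str.translate(None, chars) deletes every character in chars
stripChars = fieldValue.translate(None, string.digits)  # everything that is not a digit
fieldValue = fieldValue.translate(None, stripChars)     # delete those, leaving the digits
print stripChars
print fieldValue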
prints:
ML---a
2658812
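
The two-argument `translate(None, ...)` form above is specific to Python 2 strings. As a hedged sketch of how the same idea might look in Python 3 (for example under ArcGIS Pro), the three-argument form of `str.maketrans` can build a table that deletes characters. The variable names below are illustrative only.

```python
import string

field_value = "ML-26588-12-a"

# Collect every character that is not an ASCII digit
strip_chars = ''.join(ch for ch in field_value if ch not in string.digits)

# The third argument of str.maketrans lists the characters to delete
delete_table = str.maketrans('', '', strip_chars)
digits = field_value.translate(delete_table)

print(strip_chars)  # ML---a
print(digits)       # 2658812
```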


And even more 🙂
ChrisSnyder
Honored Contributor
.translate - I like that!
EricMartinson
Occasional Contributor

Thanks, Mike! It was an agonizing quest before I found your simple answer!

Eric

EricMartinson
Occasional Contributor

Mike, or anyone, are you still out there? Your solution worked perfectly for me in a field calculation a couple days ago. Now, using the same fields in the same database, but a different selection of records, I'm getting an error. I'm too much of a dolt in Python to see why. Here's what I'm getting:

"syntaxerror: encoding declaration in unicode string (<expression>, line 0)" (see below)

At first I thought it might be some characters that were causing the issue (hyphens, slashes, spaces), but eliminating them didn't help.

Any ideas?

Eric

[Screenshot: Python field calculation Unicode syntax error]

curtvprice
MVP Alum
... and then tell me how to apply it to the code block on Calculate Field GP tool.


Expression:
ExtractNum(!INFIELD!)

Code block:
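def ExtractNum(fieldValue):
  try:
    # keep only the digit characters of the field value
    val = ''.join([i for i in fieldValue if i.isdigit()])
  except TypeError:
    # fieldValue was not a string (for example a NULL, passed in as None)
    val = None
  return val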
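
For anyone running this from a script instead of the tool dialog, here is a minimal sketch of passing the same expression and code block to `arcpy.CalculateField_management`. The wrapper name `calculate_digits` and its arguments are hypothetical placeholders, and the `"PYTHON_9.3"` expression type assumes ArcMap 10.1 or later (ArcGIS Pro uses `"PYTHON3"`). The last lines only check the code block in plain Python, without ArcGIS.

```python
CODE_BLOCK = """def ExtractNum(fieldValue):
    try:
        return ''.join(ch for ch in fieldValue if ch.isdigit())
    except TypeError:  # e.g. a NULL value passed in as None
        return None
"""

def calculate_digits(table, target_field, source_field):
    """Hypothetical wrapper: table and field names are placeholders."""
    import arcpy  # imported here so the rest of the sketch runs without ArcGIS
    expression = "ExtractNum(!{}!)".format(source_field)
    arcpy.CalculateField_management(table, target_field, expression,
                                    "PYTHON_9.3", CODE_BLOCK)

# Sanity-check the code block outside ArcGIS
namespace = {}
exec(CODE_BLOCK, namespace)
result = namespace["ExtractNum"]("UTLM58778A")
none_result = namespace["ExtractNum"](None)
print(result)       # 58778
print(none_result)  # None
```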
JessicaKirby
Deactivated User
Thank you Mike, this worked like a charm!
shivamparashari
Deactivated User
Hi,

I am stuck on the same problem: I am trying to remove the alpha characters from a "House Number" field. A house number can look like 62/A, A/62, 62A/1, etc. I want to remove only the alpha characters (which can be anywhere in the string) and nothing else. Can anyone help me with the code (Python or SQL)? 😞
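
One possible approach in Python, offered as a sketch and not as a confirmed answer from this thread, is to invert the test used in the accepted solution and drop only the characters for which `str.isalpha()` is true. The helper name `strip_letters` is hypothetical.

```python
def strip_letters(value):
    """Remove alphabetic characters and keep everything else."""
    return ''.join(ch for ch in value if not ch.isalpha())

# Sample house numbers from the post above
house_numbers = ["62/A", "A/62", "62A/1"]
stripped = [strip_letters(h) for h in house_numbers]
print(stripped)  # ['62/', '/62', '62/1']
```

As the output shows, slashes are left in place, so values like 62/A end up with a leading or trailing slash that may need separate handling.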