how to edit out illegal characters in columns in large tables (>1000 files)

2093
12
06-15-2017 11:51 AM
LarryPearson
New Contributor

I'm attempting to join two tables.  

The join validation report indicates several fields with illegal characters.  I'm editing a large county file with over 1000 parcels.  How to remove characters such as "-" or "/" for a whole column of data without doing it manually?

0 Kudos
12 Replies
DanPatterson_Retired
MVP Emeritus

how many illegal characters? If you can use the field calculator a simple !YourFieldName!.replace('-','') using a python parser would be a simple approach.  For many replacements a small function (def) could be written

0 Kudos
TedKowal
Occasional Contributor III

To add on to Dan's comment.  If you have a list of special characters to remove you can do it in the field calculater inside a def with something similiar to .....

>>> import string
>>> StringThatNeedsWork = "The - Time ! was well f$$$### spent   don't, you agree? 12345% 1.234"
>>> removethesechars = ['$','#','%','-']
>>> StringThatNeedsWork.translate(None,''.join(removethesechars))


"The  Time ! was well f spent   don't, you agree? 12345 1.234"
‍‍‍‍‍‍‍‍‍‍‍‍‍
mfcallahan
Occasional Contributor II

If by illegal characters you mean all non-alphanumeric characters, you can use a list comprehension with str.isalnum():

testString = 'foo-bar#baz?qux@127/\\9]'
testStringCleaned = "".join(x for x in testString if x.isalnum()) # foobarbazqux1279‍‍

Found on Stack Overflow - Turn a string into a valid filename?

0 Kudos
TedKowal
Occasional Contributor III

Note that using .isalnum() will strip out significant '." and spaces as well   testString = " North Highway Route 2.1"  ---> "NorthHighwayRoute21"  if that is ok then it is a quick easy way!

mfcallahan
Occasional Contributor II

Sometimes I do this, and replace and illegal chars or spaces with an underscore:

testStringCleaned = "".join([x if x.isalnum() else "_" for x in testString])# foo_bar_baz_qux_127__9_

 

AbdullahAnter
Occasional Contributor III

Ted , What about your answer in same question? Does it work?

https://community.esri.com/message/688777-odd-special-character-in-field-value 

0 Kudos
TedKowal
Occasional Contributor III

Yes that works just fine, however, that solution is run on the database and not gis software, and specifically it removes "ALL" characters except for the ones with ASCII values between 0 and 255.  So spaces and decimal values are not stripped.  In this case, I am assuming the individual wanted to use the field calculate and only remove certain characters.

JoshuaBixby
MVP Esteemed Contributor

In additional to 7.1. string — Common string operations — Python 2.7.13 documentation , 7.2. re — Regular expression operations — Python 2.7.13 documentation can also be used:

>>> StringThatNeedsWork = "The - Time ! was well f$$$### spent   don't, you agree? 12345% 1.234"
>>> re.sub("[$,#,%,-]","", StringThatNeedsWork)
"The  Time ! was well f spent   don't you agree? 12345 1.234"
>>> 
TedKowal
Occasional Contributor III

I like yours better!   I have a long way to go in learning Python!