how to edit out illegal characters in columns in large tables (>1000 files)

582
12
06-15-2017 11:51 AM
Highlighted
New Contributor

I'm attempting to join two tables.  

The join validation report indicates several fields with illegal characters.  I'm editing a large county file with over 1000 parcels.  How to remove characters such as "-" or "/" for a whole column of data without doing it manually?

Reply
0 Kudos
12 Replies
Highlighted
MVP Esteemed Contributor

how many illegal characters? If you can use the field calculator a simple !YourFieldName!.replace('-','') using a python parser would be a simple approach.  For many replacements a small function (def) could be written

Reply
0 Kudos
Highlighted
Regular Contributor II

To add on to Dan's comment.  If you have a list of special characters to remove you can do it in the field calculater inside a def with something similiar to .....

>>> import string
>>> StringThatNeedsWork = "The - Time ! was well f$$$### spent   don't, you agree? 12345% 1.234"
>>> removethesechars = ['$','#','%','-']
>>> StringThatNeedsWork.translate(None,''.join(removethesechars))


"The  Time ! was well f spent   don't, you agree? 12345 1.234"
‍‍‍‍‍‍‍‍‍‍‍‍‍
Highlighted
Occasional Contributor

If by illegal characters you mean all non-alphanumeric characters, you can use a list comprehension with str.isalnum():

testString = 'foo-bar#baz?qux@127/\\9]'
testStringCleaned = "".join(x for x in testString if x.isalnum()) # foobarbazqux1279‍‍

Found on Stack Overflow - Turn a string into a valid filename?

Reply
0 Kudos
Highlighted
Regular Contributor II

Note that using .isalnum() will strip out significant '." and spaces as well   testString = " North Highway Route 2.1"  ---> "NorthHighwayRoute21"  if that is ok then it is a quick easy way!

Highlighted
Occasional Contributor

Sometimes I do this, and replace and illegal chars or spaces with an underscore:

testStringCleaned = "".join([x if x.isalnum() else "_" for x in testString])# foo_bar_baz_qux_127__9_

 

Highlighted
Regular Contributor II

Ted , What about your answer in same question? Does it work?

https://community.esri.com/message/688777-odd-special-character-in-field-value 

Reply
0 Kudos
Highlighted
Regular Contributor II

Yes that works just fine, however, that solution is run on the database and not gis software, and specifically it removes "ALL" characters except for the ones with ASCII values between 0 and 255.  So spaces and decimal values are not stripped.  In this case, I am assuming the individual wanted to use the field calculate and only remove certain characters.

Highlighted
MVP Esteemed Contributor

In additional to 7.1. string — Common string operations — Python 2.7.13 documentation , 7.2. re — Regular expression operations — Python 2.7.13 documentation can also be used:

>>> StringThatNeedsWork = "The - Time ! was well f$$$### spent   don't, you agree? 12345% 1.234"
>>> re.sub("[$,#,%,-]","", StringThatNeedsWork)
"The  Time ! was well f spent   don't you agree? 12345 1.234"
>>>
Highlighted
Regular Contributor II

I like yours better!   I have a long way to go in learning Python!