Select to view content in your preferred language

how to edit out illegal characters in columns in large tables (>1000 files)

2588
12
06-15-2017 11:51 AM
LarryPearson
Deactivated User

I'm attempting to join two tables.  

The join validation report indicates several fields with illegal characters.  I'm editing a large county file with over 1000 parcels.  How to remove characters such as "-" or "/" for a whole column of data without doing it manually?

0 Kudos
12 Replies
DanPatterson_Retired
MVP Emeritus

how many illegal characters? If you can use the field calculator a simple !YourFieldName!.replace('-','') using a python parser would be a simple approach.  For many replacements a small function (def) could be written

0 Kudos
TedKowal
Honored Contributor

To add on to Dan's comment.  If you have a list of special characters to remove you can do it in the field calculater inside a def with something similiar to .....

>>> import string
>>> StringThatNeedsWork = "The - Time ! was well f$$$### spent   don't, you agree? 12345% 1.234"
>>> removethesechars = ['$','#','%','-']
>>> StringThatNeedsWork.translate(None,''.join(removethesechars))


"The  Time ! was well f spent   don't, you agree? 12345 1.234"
‍‍‍‍‍‍‍‍‍‍‍‍‍
mfcallahan
Frequent Contributor

If by illegal characters you mean all non-alphanumeric characters, you can use a list comprehension with str.isalnum():

testString = 'foo-bar#baz?qux@127/\\9]'
testStringCleaned = "".join(x for x in testString if x.isalnum()) # foobarbazqux1279‍‍

Found on Stack Overflow - Turn a string into a valid filename?

0 Kudos
TedKowal
Honored Contributor

Note that using .isalnum() will strip out significant '." and spaces as well   testString = " North Highway Route 2.1"  ---> "NorthHighwayRoute21"  if that is ok then it is a quick easy way!

mfcallahan
Frequent Contributor

Sometimes I do this, and replace and illegal chars or spaces with an underscore:

testStringCleaned = "".join([x if x.isalnum() else "_" for x in testString])# foo_bar_baz_qux_127__9_

 

AbdullahAnter
Honored Contributor

Ted , What about your answer in same question? Does it work?

https://community.esri.com/message/688777-odd-special-character-in-field-value 

0 Kudos
TedKowal
Honored Contributor

Yes that works just fine, however, that solution is run on the database and not gis software, and specifically it removes "ALL" characters except for the ones with ASCII values between 0 and 255.  So spaces and decimal values are not stripped.  In this case, I am assuming the individual wanted to use the field calculate and only remove certain characters.

JoshuaBixby
MVP Esteemed Contributor

In additional to 7.1. string — Common string operations — Python 2.7.13 documentation , 7.2. re — Regular expression operations — Python 2.7.13 documentation can also be used:

>>> StringThatNeedsWork = "The - Time ! was well f$$$### spent   don't, you agree? 12345% 1.234"
>>> re.sub("[$,#,%,-]","", StringThatNeedsWork)
"The  Time ! was well f spent   don't you agree? 12345 1.234"
>>> 
TedKowal
Honored Contributor

I like yours better!   I have a long way to go in learning Python!