Unicode error

AmyKlug · 07-29-2015

Hi,

When I run this code I get a Unicode error halfway through. I'm not sure which mxd or layer name is tripping it up (line 33), and I'm not sure where to put the fix (line 7), or whether that is even the correct fix.

import arcpy, os

#code adds mxd name and layer path name to text file separated by a comma
arcpy.env.overwriteOutput = True


#def Utf8EncodeArray(oldArray):
    #newArray = []
    #for element in oldArray:
        #if isinstance(element, unicode):
            #newArray.append(element.encode("utf-8"))
        #else:
            #newArray.append(element)
    #return newArray


path = "////serverpath"
#path2 =
mxdlst = []
txt = open("text file path", 'w')
print "making mxd list"
for root, dirs, files in os.walk(path):
    for fname in files:
        if fname.endswith(".mxd"):
            mxd = root + '\\' + fname
            mxdlst.append(mxd)
del mxd, fname
for mapdoc in mxdlst:
    mxd = arcpy.mapping.MapDocument(mapdoc)
    for df in arcpy.mapping.ListDataFrames(mxd, "*"):
        for lyrlst in arcpy.mapping.ListLayers(mxd, "*", df):
            if lyrlst.supports("DATASOURCE"):
                txt.write(mapdoc + "," + lyrlst.workspacePath + "\\" + lyrlst.name + "\n")
                print "adding" + mapdoc + "," + lyrlst.workspacePath + "\\" + lyrlst.name + "\n"
            else:
                txt.write(mapdoc + "," + lyrlst.name + "\n")
                print "adding" + mapdoc + "," + lyrlst.name + "\n"
txt.close()
del mxd, df, lyrlst, mapdoc, mxdlst

DanPatterson_Retired · 07-29-2015

What is the layer name, etc.? If it contains characters that need to be converted, then you will have to convert them.

AmyKlug · 07-29-2015

I need code to check for that and fix it; I'm just not sure where to put it.

DanPatterson_Retired · 07-29-2015

I haven't run into Unicode issues yet, but apparently I will have to. One suggestion is to specify the encoding at the top of the script with

# -*- coding: utf-8 -*-

but someone who works with Unicode should step in, since it appears that you have a character that can't be represented by the ASCII characters in the range 0-127.
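
To track down which name is the culprit, something along these lines could help. This is only a sketch: `find_non_ascii` and the sample names are made up for illustration, and in the real script you would feed it `mapdoc` and `lyrlst.name` instead. It is written so that it should run on Python 2.7 as well as Python 3.

```python
from __future__ import print_function


def find_non_ascii(text):
    """Return (position, code point) pairs for characters outside ASCII 0-127."""
    return [(i, "U+%04X" % ord(ch)) for i, ch in enumerate(text) if ord(ch) > 127]


# Hypothetical names standing in for real mxd paths and layer names
names = [u"roads", u"parcels_2015", u"caf\xe9_locations"]

for index, name in enumerate(names):
    hits = find_non_ascii(name)
    if hits:
        # Report by index and code point so the report itself stays ASCII
        print("name %d has non-ASCII characters: %s" % (index, hits))
```

Reporting the position and code point rather than the character itself keeps the check from tripping over the same encoding problem it is trying to find.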

AmyKlug · 07-29-2015

I have seen that before. Does it work with the # sign in front?

DanPatterson_Retired · 07-29-2015

Apparently that is what is supposed to be done, on the first line. But I seriously haven't played with any encoding other than ASCII. I really should, given accented characters etc., but so far I haven't had to deal with it. My only suggestion is to find someone who works with such data, and/or look at one of the files and see what characters are present there. Sorry I can't help more, but searching for your error message on GeoNet may turn up more.

XanderBakker · 07-29-2015

An interesting article to read would be:

Solving Unicode Problems in Python 2.7 | Azavea Labs

And to give an example of what works and fails:

# -*- coding: utf-8 -*-
myText = u"example of únì¢ødë"

# this works:
print myText

# UnicodeEncodeError: 'ascii' codec can't encode character u'\xfa' in position 11: ordinal not in range(128)
print "{0}".format(myText)
# inserting unicode in a str

# This works 
print u"{0}".format(myText)
# inserting unicode into a unicode

# UnicodeEncodeError: 'ascii' codec can't encode character u'\xfa' in position 11: ordinal not in range(128)
print u"{0}".format(myText.decode('utf-8'))
# calling .decode() on a unicode object: Python 2 first encodes it with ASCII, which fails
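
For the original script, the failing cases above are roughly what happens at `txt.write(...)`: unicode text gets pushed through the ASCII codec. One possible way around it, as a sketch, is to open the output file with `io.open` and an explicit encoding, and to build each line as unicode. The paths and layer names below are invented stand-ins for what arcpy returns (assumed here to be unicode), and the output goes to a temporary folder rather than the real text file.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical stand-ins for (mapdoc, lyrlst.workspacePath, lyrlst.name)
rows = [
    (u"C:\\maps\\town.mxd", u"C:\\data\\gdb", u"roads"),
    (u"C:\\maps\\caf\xe9.mxd", u"C:\\data\\gdb", u"b\xe2timents"),
]

# Temporary location used only for this illustration
out_path = os.path.join(tempfile.mkdtemp(), "layers.txt")

# io.open encodes unicode text as UTF-8 on write (Python 2.7 and 3)
with io.open(out_path, "w", encoding="utf-8") as txt:
    for mapdoc, workspace, layer in rows:
        # Unicode format string, so no implicit ASCII conversion takes place
        txt.write(u"{0},{1}\\{2}\n".format(mapdoc, workspace, layer))

# Read the file back with the same encoding to check the round trip
with io.open(out_path, "r", encoding="utf-8") as txt:
    lines = txt.read().splitlines()
```

The same idea applies to the `print` lines: keep everything unicode until the very last step and let one place decide the encoding.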

DanPatterson_Retired · 07-30-2015

what a kludge...

# -*- coding: utf-8 -*-
chars = [unichr(i) for i in range(0,256) if (32 < i < 128) or (i > 161)]
print "Unicode characters 33-127 and 162-255\n" + (u"{:5}"*len(chars)).format(*chars)

Unicode characters 33-127 and 162-255

! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ

anything not visible, isn't...can't wait to learn all of them

Luke_Pinner · 07-30-2015

Dan Patterson:
...one suggestion is to specify encoding at the top of the script with # -*- coding: utf-8 -*-

This only applies to literal characters in the Python script file itself, not to string values the script handles when it runs.

test_noenc.py

somestr = u'über còól'
print somestr

test_enc.py

# -*- coding: utf-8 -*-
somestr = u'über còól'
print somestr

C:\Temp>python test_noenc.py
File "test_noenc.py", line 1
SyntaxError: Non-ASCII character '\xfc' in file test_noenc.py on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

C:\Temp>python test_enc.py
über còól
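
For text that only shows up at run time (file names, layer names, file contents), the conversion has to be done explicitly in code. A small sketch, using made-up byte values, that should behave the same with or without the coding line:

```python
# -*- coding: utf-8 -*-
# The line above only affects how literals in this file are read.

# Bytes as they might arrive at run time, e.g. read from a file (made-up sample)
raw = b"\xc3\xbcber c\xc3\xb2\xc3\xb3l"

# Explicit decode: bytes -> unicode text
text = raw.decode("utf-8")

# Explicit encode: unicode text -> bytes, e.g. before writing to a byte stream
encoded = text.encode("utf-8")

# Decoding the same bytes as ASCII fails, declaration or not
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

The escapes are used instead of typed accented characters so the example does not depend on how the script file itself was saved.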

DanPatterson_Retired · 07-30-2015

Thanks Luke...fixed