I have written a script that (should) step through a list of all the tables in a MySQL database and write the contents of each to a CSV file. I've used this basic construct successfully in the past, but with specified tables. This time I'm looping through all the tables in the given database and I'm encountering Unicode errors. For example:
UnicodeEncodeError: 'charmap' codec can't encode character '\x9d' in position 646: character maps to <undefined>
The script rolls along and creates a couple dozen CSV files without a hitch, so I looked at what had been created and figured the next table in the list was the culprit. But I get a different Unicode error when I run the script on just that table:
UnicodeEncodeError: 'charmap' codec can't encode character '\u0400' in position 1255: character maps to <undefined>
Clearly I stand to see these sorts of errors every time through the loop. Is there a way to trap these errors and then exclude the offending character from the output?
Here's the code, head to toe:
# -*- coding: utf-8 -*-
"""
Created on Tue Dec 22 12:37:49 2020
@author: jborgione
"""
import csv
import os

import mysql.connector as mysql

outFolder = r'N:\GIS\E360'

def createCSV(table):
    cn = mysql.connect(user='user', password='password', host='host', database='master_db')
    cursor = cn.cursor()
    query = f'SELECT * from {table};'
    cursor.execute(query)
    rows = cursor.fetchall()
    column_names = [i[0] for i in cursor.description]
    dotcsv = f'{table}.csv'
    with open(os.path.join(outFolder, dotcsv), 'w', newline='') as outFile:
        myFile = csv.writer(outFile)
        myFile.writerow(column_names)
        myFile.writerows(rows)
    cursor.close()
    cn.close()

def main():
    tableList = []
    cn = mysql.connect(user='user', password='password', host='host', database='master_db')
    cursor = cn.cursor()
    cursor.execute('show tables')
    for row in cursor:
        tableList.append(row[0])
    cursor.close()
    cn.close()
    for table in tableList:
        createCSV(table)

if __name__ == '__main__':  # if def main exists, call it
    main()
After some Google fishing I added a couple of things that got me past the errors:
# not entirely sure what this does, but I remember it from an earlier post I
# made some time back using Python 3.6; the old reload(sys) does not work
import sys
import importlib
importlib.reload(sys)
# added encoding='utf-8', errors='ignore'
outFile = open(os.path.join(outFolder, dotcsv), 'w', encoding='utf-8', errors='ignore', newline='')
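For what it's worth, here is a small self-contained sketch (mine, not from the original post) of how Python's codec error handlers behave when a codec can't represent a character. `'cp1252'` stands in for the Windows `'charmap'` codec named in the tracebacks, and `'\u0400'` is the character from the second error:

```python
s = 'caf\u0400'  # ends with CYRILLIC CAPITAL LETTER IE WITH GRAVE

# 'strict' (the default) raises UnicodeEncodeError, as in the tracebacks:
try:
    s.encode('cp1252')
except UnicodeEncodeError as e:
    print('strict raised:', e.reason)

# 'ignore' silently drops the character:
print(s.encode('cp1252', errors='ignore'))            # b'caf'

# 'replace' substitutes '?' so the data loss stays visible:
print(s.encode('cp1252', errors='replace'))           # b'caf?'

# 'backslashreplace' keeps the code point as an escape:
print(s.encode('cp1252', errors='backslashreplace'))  # b'caf\\u0400'

# utf-8 can encode it outright, which is why encoding='utf-8' fixes it:
print(s.encode('utf-8'))
```

These same handler names work as the `errors=` argument to `open()`, so `errors='replace'` may be preferable to `'ignore'` if you want the output to show where characters were dropped.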
csv and other crossover modules aren't unicode compliant by default
python - How to write UTF-8 in a CSV file - Stack Overflow
has this ditty
import csv
with open('output_file_name', 'w', newline='', encoding='utf-8') as csv_file:
    writer = csv.writer(csv_file, delimiter=';')
    writer.writerow(['my_utf8_string'])  # writerow expects a sequence of fields
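One caveat about that snippet (my note, not from the linked answer): `writerow()` expects a sequence of fields, so passing a bare string iterates it character by character, one column each:

```python
import csv
import io

buf = io.StringIO()        # in-memory stand-in for a real file
writer = csv.writer(buf)
writer.writerow('abc')     # a string is iterated char by char -> three columns
writer.writerow(['abc'])   # a one-element list gives a single column
print(buf.getvalue())      # 'a,b,c\r\nabc\r\n'
```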
Good to know!
More details:
csv — CSV File Reading and Writing — Python 3.9.1 documentation
just flip back to the Python version you're actually using (3.6 if using Pro 2.5/2.6, or 3.7 for Pro 2.7)
This is totally un-ArcGIS Pro for now; just pure python...