I have written a script that (should) step through a list of all the tables in a MySQL database and write the contents of each to a CSV file. I've used this basic construct successfully in the past, but with specified tables. This time I'm looping through all the tables in the given database and I'm encountering Unicode errors. For example:
UnicodeEncodeError: 'charmap' codec can't encode character '\x9d' in position 646: character maps to <undefined>
The script rolls along and creates a couple dozen CSV files without a hitch, so I looked at what had been created and figured the next table in the list was the culprit. But I get a different Unicode error when I run the script on just that table:
UnicodeEncodeError: 'charmap' codec can't encode character '\u0400' in position 1255: character maps to <undefined>
Clearly I stand to see these sorts of errors every time through the loop. Is there a way to trap these errors and then exclude the offending character from the output?
Here's the code, head to toe:
# -*- coding: utf-8 -*-
"""
Created on Tue Dec 22 12:37:49 2020
@author: jborgione
"""
import csv
import os

import mysql.connector as mysql

outFolder = r'N:\GIS\E360'

def createCSV(table):
    cn = mysql.connect(user='user', password='password', host='host', database='master_db')
    cursor = cn.cursor()
    query = f'SELECT * from {table};'
    cursor.execute(query)
    rows = cursor.fetchall()
    column_names = [i[0] for i in cursor.description]
    dotcsv = f'{table}.csv'
    with open(os.path.join(outFolder, dotcsv), 'w', newline='') as outFile:
        myFile = csv.writer(outFile)
        myFile.writerow(column_names)
        myFile.writerows(rows)
    cursor.close()
    cn.close()

def main():
    tableList = []
    cn = mysql.connect(user='user', password='password', host='host', database='master_db')
    cursor = cn.cursor()
    cursor.execute('show tables')
    for row in cursor:
        tableList.append(row[0])
    cursor.close()
    cn.close()
    for table in tableList:
        createCSV(table)

if __name__ == '__main__':  # if def main exists, call it
    main()
After some Google fishing I added a couple of things that got me past the errors:
# not entirely sure what this does, but I remember it from an earlier post I
# made some time back using Python 3.6; the old reload(sys) does not work
import sys
import importlib
importlib.reload(sys)
# added encoding='utf-8', errors='ignore'
outFile = open(os.path.join(outFolder, dotcsv), 'w', encoding='utf-8', errors='ignore', newline='')
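For what it's worth, here is a small self-contained sketch (mine, not from the original post) of how Python's codec error handlers behave when a codec can't represent a character. `'cp1252'` stands in for the Windows `'charmap'` codec named in the tracebacks, and `'\u0400'` is the character from the second error:

```python
s = 'caf\u0400'  # ends with CYRILLIC CAPITAL LETTER IE WITH GRAVE

# 'strict' (the default) raises UnicodeEncodeError, as in the tracebacks:
try:
    s.encode('cp1252')
except UnicodeEncodeError as e:
    print('strict raised:', e.reason)

# 'ignore' silently drops the character:
print(s.encode('cp1252', errors='ignore'))            # b'caf'

# 'replace' substitutes '?' so the data loss stays visible:
print(s.encode('cp1252', errors='replace'))           # b'caf?'

# 'backslashreplace' keeps the code point as an escape:
print(s.encode('cp1252', errors='backslashreplace'))  # b'caf\\u0400'

# utf-8 can encode it outright, which is why encoding='utf-8' fixes it:
print(s.encode('utf-8'))
```

These same handler names work as the `errors=` argument to `open()`, so `errors='replace'` may be preferable to `'ignore'` if you want the output to show where characters were dropped.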
csv and other crossover modules aren't unicode compliant by default
python - How to write UTF-8 in a CSV file - Stack Overflow
has this ditty
import csv
with open('output_file_name', 'w', newline='', encoding='utf-8') as csv_file:
    writer = csv.writer(csv_file, delimiter=';')
    writer.writerow(['my_utf8_string'])  # writerow expects a sequence of fields
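One caveat about that snippet (my note, not from the linked answer): `writerow()` expects a sequence of fields, so passing a bare string iterates it character by character, one column each:

```python
import csv
import io

buf = io.StringIO()        # in-memory stand-in for a real file
writer = csv.writer(buf)
writer.writerow('abc')     # a string is iterated char by char -> three columns
writer.writerow(['abc'])   # a one-element list gives a single column
print(buf.getvalue())      # 'a,b,c\r\nabc\r\n'
```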
Good to know!
More details:
csv — CSV File Reading and Writing — Python 3.9.1 documentation
just flip back to the Python version you're actually using (3.6 if using Pro 2.5/2.6, or 3.7 for Pro 2.7)
This is totally un-ArcGIS Pro for now; just pure python...