Problems with encoding

08-18-2022 02:35 AM
MaozTreeCare
New Contributor II

Hello,

I'm using this script to export all my assignments.

https://github.com/Esri/workforce-scripts/blob/master/readmes/export_assignments_to_csv_readme.md

Some of the assignment text is in Hebrew.

The text I'm getting is:

׳ ׳©׳œ׳—׳” ׳₪׳ ׳™׳”: 28/04/2022 06:43׳¢׳¨׳ž׳” ׳©׳œ ׳’׳–׳ ׳¢׳¦׳™׳ ׳©׳œ ׳’׳ ׳™׳ ׳•׳ ׳•׳£

 

What I did:

I'm using

# -*- coding: utf-8 -*-

at the top of my Python file.

I opened the file in Notepad++ and converted it to UTF-8 without BOM.

I'm using UTF-8 encoding when writing the CSV file:

with open(arguments.csv_file, 'w', newline='', encoding='utf-8') as csv_file:

 

Is there anything I'm missing here?

Thank you
1 Solution

Accepted Solutions
MaozTreeCare
New Contributor II

Thank you!

I figured out that the issue was not with GIS or Python at all, but with Excel!

Excel did not open the UTF-8 CSV file correctly.

I had to change the open call to

with open(arguments.csv_file, 'w', newline='', encoding='utf-8-sig') as csv_file:

Only with 'utf-8-sig' (UTF-8 with a BOM) was Excel able to detect that the file is in UTF-8 format.

I should have checked the file itself; the text was written correctly.
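
For anyone who wants to verify this on their own output, here is a minimal sketch (not the workforce-scripts code itself). It writes the same rows with 'utf-8' and 'utf-8-sig' and checks the raw bytes. The helper names, the temporary file names, and the sample Hebrew row are made up for illustration.

```python
import csv
import os
import tempfile

UTF8_BOM = b"\xef\xbb\xbf"  # the three bytes that 'utf-8-sig' writes first


def write_csv(path, rows, encoding):
    """Write rows to a CSV file using the given text encoding (hypothetical helper)."""
    with open(path, "w", newline="", encoding=encoding) as csv_file:
        csv.writer(csv_file).writerows(rows)


def starts_with_bom(path):
    """Return True if the file's first three bytes are the UTF-8 BOM."""
    with open(path, "rb") as raw_file:
        return raw_file.read(3) == UTF8_BOM


def read_csv(path):
    """Read rows back; 'utf-8-sig' skips a BOM if present and also reads plain UTF-8."""
    with open(path, newline="", encoding="utf-8-sig") as csv_file:
        return list(csv.reader(csv_file))


# Sample data only: a header plus the Hebrew word "shalom", not real assignment text.
rows = [["description"], ["\u05e9\u05dc\u05d5\u05dd"]]

tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "plain.csv")
bom_path = os.path.join(tmp_dir, "bom.csv")

write_csv(plain_path, rows, "utf-8")
write_csv(bom_path, rows, "utf-8-sig")

plain_has_bom = starts_with_bom(plain_path)
sig_has_bom = starts_with_bom(bom_path)

print("utf-8 file has BOM:", plain_has_bom)
print("utf-8-sig file has BOM:", sig_has_bom)
print("same text read back:", read_csv(plain_path) == read_csv(bom_path) == rows)
```

Both files contain the same correctly encoded text; the only difference is the BOM at the start, which is the hint Excel used in my case.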

Thank you

2 Replies
ShaunWalbridge
Esri Regular Contributor

The header of the Python file only affects the encoding of the text within the script itself, so it is probably unrelated. Opening the file with 'utf-8' looks correct. I would check that the database is storing the information correctly as UTF-8 (e.g. that it wasn't re-encoded into something else during storage). You can also pause your script at the relevant point of execution to check what Python thinks it's seeing: if you add `breakpoint()` to your code, you'll get the PDB debugger, which will allow you to inspect the state.
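
As a rough sketch of what to look for once you are inspecting a value (at the PDB prompt or with plain prints), you can list the code points Python actually holds. The `describe` helper and the sample strings below are hypothetical, and the Windows-1255 step only simulates one possible kind of damage; it is not a claim about what happened in your data.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) for each character in text (hypothetical helper)."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]


# Sample value only: the Hebrew letters shin, nun, he.
good = "\u05e9\u05e0\u05d4"

# Simulated damage (an assumption for illustration): the UTF-8 bytes of the
# same text decoded with the legacy Windows-1255 code page.
bad = good.encode("utf-8").decode("cp1255")

for label, value in (("good", good), ("bad", bad)):
    # ascii() shows escapes, so the result does not depend on the terminal encoding.
    print(label, ascii(value))
    for code_point, name in describe(value):
        print("   ", code_point, name)
```

If the string in your script lists only HEBREW LETTER names, Python is holding the text correctly and the problem is further downstream, for example in the program used to view the file.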
