<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: gdal.VectorTranslate() transforms non-ASCII characters? in Python Questions</title>
    <link>https://community.esri.com/t5/python-questions/gdal-vectortranslate-transforms-non-ascii/m-p/1674364#M75019</link>
    <description>&lt;P&gt;Okay, with the help of this &lt;A href="https://stackoverflow.com/questions/81154/how-do-i-determine-which-encoding-system-is-used-in-my-ms-access-database" target="_self"&gt;post&lt;/A&gt;, I'm able to read it correctly.&lt;/P&gt;&lt;LI-CODE lang="python"&gt;from osgeo import ogr

# in_ds is the input datasource (the .mdb), defined elsewhere
inmdb = ogr.Open(in_ds)
sql = "Select TextString from LDAnno"
res = inmdb.ExecuteSQL(sql)
for r in res:
    # Decode the raw field bytes with the Windows ANSI code page.
    # 'dbcs' is an alias of Python's 'mbcs' codec, so this only works on Windows.
    t = bytes(r.GetFieldAsBinary(0)).decode('dbcs')
    print(t)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;S½NE¼ W½SE¼ E½SW¼ E½SW¼ W½NE¼ W½NE¼ S½NW¼ N½SW¼&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;The question remains: how do I force gdal to use that decoding instead of doing its own thing?&lt;/P&gt;</description>
    <pubDate>Fri, 19 Dec 2025 19:49:50 GMT</pubDate>
    <dc:creator>AlfredBaldenweck</dc:creator>
    <dc:date>2025-12-19T19:49:50Z</dc:date>
    <item>
      <title>gdal.VectorTranslate() transforms non-ASCII characters?</title>
      <link>https://community.esri.com/t5/python-questions/gdal-vectortranslate-transforms-non-ascii/m-p/1674332#M75017</link>
      <description>&lt;P&gt;I'm trying a workflow using gdal.VectorTranslate(), since &lt;A href="https://community.esri.com/t5/python-questions/where-did-ogr2ogr-go/m-p/1651065" target="_self"&gt;ogr2ogr&lt;/A&gt; isn't there anymore.&lt;/P&gt;&lt;P&gt;The original data uses non-ASCII characters, and they are replaced by&amp;nbsp;� when translated.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="AlfredBaldenweck_0-1766169224937.png" style="width: 400px;"&gt;&lt;img src="https://community.esri.com/t5/image/serverpage/image-id/145930i5FB3F00F95FBA0C8/image-size/medium?v=v2&amp;amp;px=400" role="button" title="AlfredBaldenweck_0-1766169224937.png" alt="AlfredBaldenweck_0-1766169224937.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="AlfredBaldenweck_1-1766169237494.png" style="width: 400px;"&gt;&lt;img src="https://community.esri.com/t5/image/serverpage/image-id/145931iC4A0E9A5C5CB21BF/image-size/medium?v=v2&amp;amp;px=400" role="button" title="AlfredBaldenweck_1-1766169237494.png" alt="AlfredBaldenweck_1-1766169237494.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;How can I make this not happen?&lt;/P&gt;&lt;P&gt;I tried &lt;A href="https://gis.stackexchange.com/questions/378199/setting-config-options-for-gdal-using-python" target="_self"&gt;setting&lt;/A&gt; the config options, but&amp;nbsp;&lt;A href="https://gdal.org/en/stable/user/configoptions.html#:~:text=OGR_FORCE_ASCII%3D%5BYES%E2%80%8B/%E2%80%8BNO%5D%3A" target="_self"&gt;"OGR_FORCE_ASCII" is only used by certain drivers and processes&lt;/A&gt;. Similarly, I cannot find a &lt;A href="https://gdal.org/en/stable/api/python/utilities.html#osgeo.gdal.VectorTranslateOptions" target="_self"&gt;Translate option&lt;/A&gt; that would appear to take care of this.&lt;/P&gt;&lt;P&gt;This is kind of a major thing. 
If I absolutely&amp;nbsp;&lt;EM&gt;have&lt;/EM&gt; to, I suppose I can get the strings from the original data, but that will majorly slow things down, not to mention complicate things.&amp;nbsp;&lt;BR /&gt;Thanks&lt;/P&gt;&lt;P&gt;Edit: it appears that this depends on the source. I have no problems going from fGDB to fGDB, but the real data is in an MDB.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Dec 2025 18:42:03 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/gdal-vectortranslate-transforms-non-ascii/m-p/1674332#M75017</guid>
      <dc:creator>AlfredBaldenweck</dc:creator>
      <dc:date>2025-12-19T18:42:03Z</dc:date>
    </item>
    <item>
      <title>Re: gdal.VectorTranslate() transforms non-ASCII characters?</title>
      <link>https://community.esri.com/t5/python-questions/gdal-vectortranslate-transforms-non-ascii/m-p/1674358#M75018</link>
      <description>&lt;P&gt;It seems that pyodbc can read it just fine, which is super frustrating, since I can't figure out how to get it into a different format without having to download a new driver, which doesn't work if I'm trying to distribute this workflow to various users (unless ?)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For context, this is what gdal is showing me for the strings:&lt;/P&gt;&lt;PRE&gt;bytearray(b'S\xbdNE\xbc')&lt;/PRE&gt;&lt;P&gt;I'm not sure how some things can read these correctly as fractions but not others?&lt;/P&gt;&lt;P&gt;I was able to convert that string to hex, which I fed to a converter online and got the desired output, and then the converter immediately broke when I tried doing it again.&lt;/P&gt;&lt;PRE&gt;53bd4e45bc&lt;/PRE&gt;&lt;P&gt;Putting the string with the fractions into the same converter gives me this:&lt;/P&gt;&lt;PRE&gt;53 c2 bd 4e 45 c2 bc&lt;/PRE&gt;&lt;P&gt;As you can see, my bytes are missing the c2 that comes before each fraction here.&lt;/P&gt;&lt;P&gt;Doing a bytes.fromhex().decode() (which defaults to UTF-8) fails because of an "invalid start byte".&lt;/P&gt;&lt;P&gt;Kind of out of ideas here, so if anyone has any I'd really appreciate it.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Dec 2025 19:44:20 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/gdal-vectortranslate-transforms-non-ascii/m-p/1674358#M75018</guid>
      <dc:creator>AlfredBaldenweck</dc:creator>
      <dc:date>2025-12-19T19:44:20Z</dc:date>
    </item>
    <item>
      <title>Re: gdal.VectorTranslate() transforms non-ASCII characters?</title>
      <link>https://community.esri.com/t5/python-questions/gdal-vectortranslate-transforms-non-ascii/m-p/1674364#M75019</link>
      <description>&lt;P&gt;Okay, with the help of this &lt;A href="https://stackoverflow.com/questions/81154/how-do-i-determine-which-encoding-system-is-used-in-my-ms-access-database" target="_self"&gt;post&lt;/A&gt;, I'm able to read it correctly.&lt;/P&gt;&lt;LI-CODE lang="python"&gt;from osgeo import ogr

# in_ds is the input datasource (the .mdb), defined elsewhere
inmdb = ogr.Open(in_ds)
sql = "Select TextString from LDAnno"
res = inmdb.ExecuteSQL(sql)
for r in res:
    # Decode the raw field bytes with the Windows ANSI code page.
    # 'dbcs' is an alias of Python's 'mbcs' codec, so this only works on Windows.
    t = bytes(r.GetFieldAsBinary(0)).decode('dbcs')
    print(t)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;S½NE¼ W½SE¼ E½SW¼ E½SW¼ W½NE¼ W½NE¼ S½NW¼ N½SW¼&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;The question remains: how do I force gdal to use that decoding instead of doing its own thing?&lt;/P&gt;</description>
      <pubDate>Fri, 19 Dec 2025 19:49:50 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/gdal-vectortranslate-transforms-non-ascii/m-p/1674364#M75019</guid>
      <dc:creator>AlfredBaldenweck</dc:creator>
      <dc:date>2025-12-19T19:49:50Z</dc:date>
    </item>
    <item>
      <title>Re: gdal.VectorTranslate() transforms non-ASCII characters?</title>
      <link>https://community.esri.com/t5/python-questions/gdal-vectortranslate-transforms-non-ascii/m-p/1675537#M75037</link>
      <description>&lt;P&gt;&lt;A href="https://github.com/OSGeo/gdal/blob/9d2c301cb3e18d2fea3af32652d0a31de0447e10/apps/ogr2ogr_lib.cpp#L90" target="_blank"&gt;https://github.com/OSGeo/gdal/blob/9d2c301cb3e18d2fea3af32652d0a31de0447e10/apps/ogr2ogr_lib.cpp#L90&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;A href="https://gdal.org/en/stable/doxygen/classCPLStringList.html" target="_blank"&gt;https://gdal.org/en/stable/doxygen/classCPLStringList.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Seems like they're using a char array for strings? Could be that the source data isn't properly encoded, or is encoded as something that isn't utf-8 (latin1? cp1252?)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You're really having a lot of fun issues with encoding lately huh&lt;/P&gt;</description>
      <pubDate>Thu, 01 Jan 2026 03:18:22 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/gdal-vectortranslate-transforms-non-ascii/m-p/1675537#M75037</guid>
      <dc:creator>HaydenWelch</dc:creator>
      <dc:date>2026-01-01T03:18:22Z</dc:date>
    </item>
    <item>
      <title>Re: gdal.VectorTranslate() transforms non-ASCII characters?</title>
      <link>https://community.esri.com/t5/python-questions/gdal-vectortranslate-transforms-non-ascii/m-p/1676143#M75040</link>
      <description>&lt;P&gt;It's ANSI, it appears.&lt;/P&gt;&lt;P&gt;I'm looking at finding all text fields, cycling through them, and then going through the final product with an update cursor to set them to the correct values. I'm not 100% sure how I'm going to do all that (for reasons I don't really want to get into, I had to create a new unique ID field for each table during the Translate() process), but we're going to try. It'd be fine if gdal just brought over the values as-is and Pro couldn't read them, but it evaluates them as utf-8, freaks out, and then changes the values.&lt;/P&gt;</description>
      <pubDate>Tue, 06 Jan 2026 17:42:11 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/gdal-vectortranslate-transforms-non-ascii/m-p/1676143#M75040</guid>
      <dc:creator>AlfredBaldenweck</dc:creator>
      <dc:date>2026-01-06T17:42:11Z</dc:date>
    </item>
  </channel>
</rss>

