<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Large Dictionary Compression? in Python Questions</title>
    <link>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26162#M1958</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;What about using SQLite inside Python? This might manage the data better, and you can run an SQL query to do the matching instead of a dictionary.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;SQLite is built into Python and there are no 2GB size limits. Does it load everything into memory? &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Attached is an example using SQLite to find duplicates in a large database where Python dictionaries overflowed. (Not written by me)&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 30 May 2012 07:08:03 GMT</pubDate>
    <dc:creator>KimOllivier</dc:creator>
    <dc:date>2012-05-30T07:08:03Z</dc:date>
    <item>
      <title>Large Dictionary Compression?</title>
      <link>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26157#M1953</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;I have a simple dictionary like this:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;exampleDict[123444556] = (1785,2234544,3545456, 165765.47654)&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;where all the keys are integers and the values are either integers or floats.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;My issue is that I need to store/access about 20 million keys at a time, and I am running out of 32-bit memory. I'd rather do this in 32-bit Python as I need (or would like) access to arcpy for its FGDB table reading/writing abilities.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Anyone know of a way to somehow "compress" keys and/or values in a dictionary? I'm looking into the binascii module, and I see lots of methods to compress strings, but not ints or floats. Maybe you can't meaningfully compress these since they are already quite numeric?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Anyone ever do something like this?&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 23 May 2012 17:34:39 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26157#M1953</guid>
      <dc:creator>ChrisSnyder</dc:creator>
      <dc:date>2012-05-23T17:34:39Z</dc:date>
    </item>
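    <!--
Not from the thread, but a minimal sketch of one way the poster's numeric data could be packed without leaving pure Python: a sorted array.array of keys plus a parallel value list replaces most of a dict's per-entry overhead on 32-bit builds. All names here are illustrative.

```python
# Sketch (illustrative, not from the thread): pack millions of integer
# keys into a sorted array.array instead of a dict. Lookup is binary
# search via bisect; keys are assumed to fit a signed 64-bit int ('q').
from array import array
from bisect import bisect_left

keys = array('q')   # sorted integer keys, ~8 bytes each
vals = []           # parallel list of value tuples

def build(pairs):
    """Load (key, value_tuple) pairs once, in sorted key order."""
    for k, v in sorted(pairs):
        keys.append(k)
        vals.append(v)

def lookup(k):
    """dict-style lookup: return the value tuple for key k."""
    i = bisect_left(keys, k)
    if i < len(keys) and keys[i] == k:
        return vals[i]
    raise KeyError(k)

build([(123444556, (1785, 2234544, 3545456, 165765.47654)),
       (5, (1, 2, 3, 4.0))])
```

The value tuples could be packed into parallel array.array columns as well for a further saving, at the cost of a fixed schema.
    -->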
    <item>
      <title>Re: Large Dictionary Compression?</title>
      <link>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26158#M1954</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;If you're running out of memory, you're sort of out of luck because internally integers are already stored as space-efficiently as possible.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;You might want to consider some other key-value store, such as &lt;/SPAN&gt;&lt;A href="http://docs.python.org/library/anydbm.html"&gt;anydbm&lt;/A&gt;&lt;SPAN&gt; or even setting up a &lt;/SPAN&gt;&lt;A href="http://redis.io/"&gt;Redis server&lt;/A&gt;&lt;SPAN&gt; and &lt;/SPAN&gt;&lt;A href="https://github.com/andymccurdy/redis-py"&gt;talking to that from Python&lt;/A&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 23 May 2012 17:59:23 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26158#M1954</guid>
      <dc:creator>JasonScheirer</dc:creator>
      <dc:date>2012-05-23T17:59:23Z</dc:date>
    </item>
    <item>
      <title>Re: Large Dictionary Compression?</title>
      <link>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26159#M1955</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Thanks Jason - I'll look into those...&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 23 May 2012 18:02:52 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26159#M1955</guid>
      <dc:creator>ChrisSnyder</dc:creator>
      <dc:date>2012-05-23T18:02:52Z</dc:date>
    </item>
    <item>
      <title>Re: Large Dictionary Compression?</title>
      <link>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26160#M1956</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Jason, after looking at that stuff... Hmmm - seems a bit over my head, I think.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;But my workaround solution (not quite working 100% yet) is to just:&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;1. export the FGDB tables to .txt format (thankfully the txt versions are &amp;lt; 2GB!).&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;2. call 64-bit Python.exe as a subprocess (which actually seems to work).&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;3. have that 64-bit python.exe process read the "tables" (txt files) into dictionaries, do the analysis, and write the results out to .txt format.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;4. back in 32-bit "arcpy-compliant" Python land, read the analysis txt table back into FGDB table format, and then *** big inhale *** proceed with the rest of the script.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Here's to 64-bit :cool: and the hope that we may have a 64-bit version of ArcGIS some day!&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 24 May 2012 18:09:03 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26160#M1956</guid>
      <dc:creator>ChrisSnyder</dc:creator>
      <dc:date>2012-05-24T18:09:03Z</dc:date>
    </item>
    <item>
      <title>Re: Large Dictionary Compression?</title>
      <link>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26161#M1957</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Nice! Glad you got something working. 10.1 Server will be 64-bit out of the box.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 24 May 2012 19:05:43 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26161#M1957</guid>
      <dc:creator>JasonScheirer</dc:creator>
      <dc:date>2012-05-24T19:05:43Z</dc:date>
    </item>
    <item>
      <title>Re: Large Dictionary Compression?</title>
      <link>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26162#M1958</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;What about using SQLite inside Python? This might manage the data better, and you can run an SQL query to do the matching instead of a dictionary.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;SQLite is built into Python and there are no 2GB size limits. Does it load everything into memory? &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Attached is an example using SQLite to find duplicates in a large database where Python dictionaries overflowed. (Not written by me)&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 30 May 2012 07:08:03 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26162#M1958</guid>
      <dc:creator>KimOllivier</dc:creator>
      <dc:date>2012-05-30T07:08:03Z</dc:date>
    </item>
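    <!--
The attached example is not reproduced in this feed, so here is a minimal sketch of the suggestion itself using Python's built-in sqlite3 module: rows go into an indexed table (on disk, so nothing is limited by 32-bit process memory) and SQL does the matching. Table and column names are illustrative.

```python
# Sketch (illustrative): use the built-in sqlite3 module instead of a
# dictionary for key matching. Pass a file path instead of ":memory:"
# to keep the table on disk rather than in RAM.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE feats (key INTEGER, a INTEGER, b REAL)")
rows = [(1, 10, 1.5), (2, 20, 2.5), (1, 30, 3.5)]
con.executemany("INSERT INTO feats VALUES (?, ?, ?)", rows)
con.execute("CREATE INDEX idx_key ON feats (key)")  # fast key lookups

# Find duplicate keys, in the spirit of the attached example.
dups = con.execute(
    "SELECT key, COUNT(*) FROM feats GROUP BY key HAVING COUNT(*) > 1"
).fetchall()
```

The GROUP BY/HAVING query replaces the "seen it before?" dictionary test that overflows on large inputs.
    -->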
    <item>
      <title>Re: Large Dictionary Compression?</title>
      <link>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26163#M1959</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;That looks very interesting, Kim, although I don't have much hardcore SQL skill...&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I think for my purposes I will stick with my 64-bit Python subprocess solution... I am using these large dictionaries to traverse/trace segments of a stream network, and speed is critical as there are so many features involved - eventually there will be hundreds of millions of features. I am comfortable writing my own code in Python to emulate fancy SQL-type stuff using dictionaries, and basically see dictionaries as a great and flexible format for creating my own RDBMS with whatever "custom" features I can dream up. I am amazed at the speed of these hash table-type structures - and it seems that the code you supplied uses some sort of formal SQL hash functionality (of which, sadly, I am totally ignorant!) - very cool.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 31 May 2012 16:19:08 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26163#M1959</guid>
      <dc:creator>ChrisSnyder</dc:creator>
      <dc:date>2012-05-31T16:19:08Z</dc:date>
    </item>
    <item>
      <title>Re: Large Dictionary Compression?</title>
      <link>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26164#M1960</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;You could also take a look at the shelve module - &lt;/SPAN&gt;&lt;A href="http://docs.python.org/library/shelve.html"&gt;http://docs.python.org/library/shelve.html&lt;/A&gt;&lt;SPAN&gt; It provides a filesystem-based dict-like class. Though as it's filesystem-based, it will probably be slower than your 64-bit Python subprocess method.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 01 Jun 2012 01:40:45 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26164#M1960</guid>
      <dc:creator>Luke_Pinner</dc:creator>
      <dc:date>2012-06-01T01:40:45Z</dc:date>
    </item>
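    <!--
A short sketch of the shelve suggestion, for reference: shelve gives a disk-backed, dict-like store, so size is bounded by disk rather than by 32-bit address space. Its keys must be strings, so the integer keys from the original question are converted; the file path here is illustrative.

```python
# Sketch (illustrative): a shelve store as a drop-in for the large dict.
# Values are pickled transparently; access is slower than an in-memory
# dict but the data lives on disk.
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bigdict")
db = shelve.open(path)
db[str(123444556)] = (1785, 2234544, 3545456, 165765.47654)
value = db[str(123444556)]   # tuple round-trips intact via pickle
db.close()
```
    -->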
    <item>
      <title>Re: Large Dictionary Compression?</title>
      <link>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26165#M1961</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Decided to finally install and test out the new 64 bit geoprocessing upgrade for 10.1 SP1. Works like a charm (except for the whole 32-bit exceptions thing, but that's okay and understandable... I never liked PGDB anyway!). Note the RAM usage in the attached screenshot (~27 GB max in use). So I can now have my huge Python dictionaries and eat arcpy too. I bet this was Jason S.' idea - thanks for implementing :).&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;[ATTACH=CONFIG]22090[/ATTACH]&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 22 Feb 2013 16:48:49 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/large-dictionary-compression/m-p/26165#M1961</guid>
      <dc:creator>ChrisSnyder</dc:creator>
      <dc:date>2013-02-22T16:48:49Z</dc:date>
    </item>
  </channel>
</rss>

