arcpy.Dissolve_management(in_fc, out_fc, myField, "#", "MULTI_PART", "DISSOLVE_LINES")
import subprocess

def dissolveInSubprocess(in_fc, out_fc, dissolve_fields, stat_fields='#',
                         multi_part='MULTI_PART', unsplit_lines='DISSOLVE_LINES',
                         xytol='#'):
    """Execute the Dissolve tool in a separate subprocess."""
    opts = (in_fc, out_fc, dissolve_fields, stat_fields, multi_part,
            unsplit_lines, xytol)
    cmd = (r"C:\Python27\ArcGIS10.1\python.exe C:\Scripts\dissolver.py "
           + " ".join(map(str, opts)))
    fn.msg("Executing subprocess with " + str(cmd))  # fn is my own logging helper
    chld = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    r = chld.communicate()
    print r
    if chld.returncode != 0:
        raise Exception("Error while dissolving in subprocess: "
                        + str(chld.returncode))
    fn.msg("Finished subprocess with " + str(cmd))
    return chld.returncode

# ... then I call the function
dissolveInSubprocess(in_fc, out_fc, ";".join(list_of_cols), stat_fields='#',
                     multi_part='MULTI_PART', unsplit_lines='DISSOLVE_LINES',
                     xytol='#')
"""Script to execute dissolve in a separate subprocess.""" def main(in_fc, out_fc, dissolve_fields, stat_fields, multi_part, unsplit_lines, xytol): import arcpy if xytol[0].isdigit(): arcpy.env.XYTolerance = xytol try: estuarycatchmentsRaw = arcpy.Dissolve_management(in_fc, out_fc, dissolve_fields, stat_fields , multi_part, unsplit_lines).getOutput(0) except Exception as ex: print ex.message return 0 import sys args = sys.argv print args script, in_fc, out_fc, dissolve_fields, stat_fields, multi_part, unsplit_lines, xytol = args main(in_fc, out_fc, dissolve_fields, stat_fields, multi_part, unsplit_lines, xytol)
Today I learned a new fact relevant to dissolving or intersecting large datasets.
ArcGIS Pro has a Parallel Overlay toolset with Parallel Dissolve and Parallel Intersect tools that I hope address this issue.
I haven't tried these tools yet but the help looks promising.
Filip.
Thank you very much for taking the time to write this answer. I found the same bug in Dissolve_management and it wasted a lot of my employer's time. The insidious semi-random nature of these memory management bugs makes them extremely difficult to debug because setting up a test case that fails reliably is difficult.
Your solution of running in a separate process is the only thing I found that worked (and like you I tried many other things). For me there is an overhead of about 6 seconds per extra process call while the other process imports arcpy, but it is at least a solution!
I also found 32-bit background geoprocessing more reliable than 64-bit (which is simply broken for me). Now I just call all geoprocessing tasks in their own separate processes (like Clip and Erase). You can't use feature layers this way, so it takes a little longer to write everything into a feature class, but I choose reliability over speed every time.
This sort of bug can really hurt your professional reputation if you are required to perform some ostensibly simple task on a tight deadline. Bugs like this should not still be in the wild.
I was seeing similar behavior dissolving large (but not that large) Raster To Polygon outputs. The tool would simply hang, and then many hours later crash with a vague "something broke" type of error message.
Some of the features were discontiguous areas of the same value and many were just one per value. My solution was to split the data into parts: 1) count all the polygons by GRID_CODE and copy the 'single-part' values (frequency = 1) to another dataset, then 2) dissolve the 'multi-part' polygons in a loop, 5000 at a time, then 3) append them all together. It's a kludge, but it (usually) works.
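The splitting logic above can be sketched in plain Python (the names `split_for_dissolve` and `BATCH` are mine, and plain lists stand in for the actual feature classes and arcpy cursors):

```python
# Sketch of the split-then-append workaround: count polygons per
# GRID_CODE, route frequency-1 values straight past the dissolve,
# and batch the rest into groups of BATCH (5000 in the post above).
from collections import Counter

BATCH = 5000

def split_for_dissolve(grid_codes, batch=BATCH):
    counts = Counter(grid_codes)
    # Step 1: single-part values get copied to another dataset as-is
    single_parts = [c for c, n in counts.items() if n == 1]
    # Step 2: multi-part values are dissolved in batches of `batch`
    multi_parts = sorted(c for c, n in counts.items() if n > 1)
    batches = [multi_parts[i:i + batch]
               for i in range(0, len(multi_parts), batch)]
    # Step 3 (not shown): append single_parts and each dissolved batch
    return single_parts, batches
```
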
Thanks, Filip and Mathew, for the additional insight and maybe a better approach. For my particular problem (this is a script that will be run a LOT), my current approach works, but putting the dissolve chunks into multiprocessing should make it run fast and complete consistently.
Alex, I second your crankiness. That said, I strongly urge you all to send these problematic datasets to Esri Support so they can be added to the test datasets Esri uses in their next development cycle. We are the best source of test datasets for them, as we work with real data every day. The trouble with this one is that sometimes the tools work and sometimes they don't -- it depends on the memory state when you launch the tool!
Just a side note, you can get the path in a version independent way by replacing this:
cmd = r"C:\Python27\ArcGIS10.1\python.exe"
with
cmd = os.path.join(sys.prefix, "python.exe")
or, if you want it to run 32-bit even if launched from arcpy x64:
cmd = os.path.join(sys.prefix.replace("x64",""), "python.exe")
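Building on that tip, here is one way to assemble the whole launch in a version-independent form (the helper name `build_cmd` is mine, and "dissolver.py" mirrors the hypothetical script path from the earlier post). Passing subprocess.Popen an argument list instead of a single string also sidesteps quoting problems when the interpreter path contains spaces:

```python
# Version-independent subprocess launch: derive the interpreter path
# from sys.prefix and return an argument list suitable for Popen.
import os
import sys

def build_cmd(script, opts):
    py = os.path.join(sys.prefix, "python.exe")  # or sys.prefix.replace("x64", "")
    return [py, script] + [str(o) for o in opts]
```
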
Filip, here's how that would work using multiprocessing.
http://stackoverflow.com/a/2046630/2234229
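For the dissolve case, the dispatch pattern might look like the sketch below (my own minimal outline, assuming the chunks have already been prepared). The worker here is a placeholder that just reports its chunk size; the real worker would import arcpy and run Dissolve_management on its chunk:

```python
# Each chunk is dissolved in its own OS process; results come back
# through a shared queue. dissolve_chunk is a stand-in worker.
import multiprocessing

def dissolve_chunk(chunk, queue):
    # Placeholder for: import arcpy; arcpy.Dissolve_management(...)
    queue.put(len(chunk))

def run_parallel(chunks):
    queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=dissolve_chunk, args=(c, queue))
             for c in chunks]
    for p in procs:
        p.start()
    # Collect one result per worker; order is nondeterministic, so sort.
    results = [queue.get() for _ in procs]
    for p in procs:
        p.join()
    return sorted(results)

if __name__ == "__main__":
    print(run_parallel([[1, 2], [3, 4, 5]]))  # prints [2, 3]
```
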
Unfortunately, I have been unsuccessful using multiprocessing in a script launched from Desktop. For a tool to be really useful, I want to be able to run it from there (as well as from a standalone script). I currently have a case open with Esri on this that clearly has one of their best people on it; when I get an answer I will put a blog post together.