
Multiprocessing won't create feature classes

04-01-2022 01:12 PM
Jay_Gregory
Regular Contributor

I'm trying to implement a fairly straightforward multiprocessing script that takes every polygon in a shapefile, tessellates into a grid, takes the center points of each grid square, and then clips the points by the original feature (since the tessellation generates a minimum bounding rectangle).  The tessellation feature class is never created so next step (FeatureToPoint) fails.  I don't understand - can someone help?

Even if I comment out all the arcpy commands except GenerateTessellation, the output is never actually created.  

 

import multiprocessing
import os
import time
import arcpy

startTime = time.time()

basedir = r"C:\Users\Jay\Documents\ArcGIS\Projects\DataPrep"
scratchGDB = r"C:\Jay\data\data.gdb"
output = os.path.join(basedir, "grids.gdb")
features = os.path.join(basedir,"features.shp")

def multifunction(ft):
    import arcpy
    name = ft[0]
    extent = ft[1].extent
    print('Working on {}'.format(name))
    fl = arcpy.management.MakeFeatureLayer(features, "{}fl".format(name), where_clause=f"Name='{name}'")
    gdb = os.path.join(basedir, "{}.gdb".format(name))
    arcpy.management.CreateFileGDB(basedir, "{}.gdb".format(name))
    # use a separate name so the module-level `output` (grids.gdb) is not shadowed
    grid_path = os.path.join(gdb, "{}grid".format(name))
    grid = arcpy.management.GenerateTessellation(grid_path, Extent=extent, Shape_Type="SQUARE", Size="900 SquareMeters")
    print('tessellation')
    pointgrid = arcpy.management.FeatureToPoint(grid, os.path.join(scratchGDB, "{}pointgrid".format(name)))
    arcpy.analysis.Clip(pointgrid, fl, os.path.join(output, f"{name}Points"))
    arcpy.management.Delete(fl)
    arcpy.management.Delete(grid)
    arcpy.management.Delete(pointgrid)

def main():
    processList = [feature for feature in arcpy.da.SearchCursor(features, ['Name', 'SHAPE@'])]
    pool = multiprocessing.Pool(1)
    pool.map(multifunction, processList[0:10])
    pool.close()
    pool.join()

if __name__ == "__main__":
    print("Running")
    main()
    executionTime = (time.time() - startTime)
    print('Execution time in seconds: ' + str(executionTime))

 

25 Replies
by Anonymous User
Not applicable

Python's multiprocessing is frustrating. What does processList[0:10] look like?

The arguments that you are passing may be unpacked as individual arguments, so you may need to pass your tuple as (processList[0:10], ) or wrap each pair in the (args, ) format when you create the list.

 

itemPairs = [([ft[0], ft[1]],) for ft in arcpy.da.SearchCursor(features, ['Name', 'SHAPE@'])]

 

Hard to say without seeing the data. 
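To make the unpacking difference concrete, here is a minimal sketch with no arcpy involved. `describe`, `wrap_for_starmap`, and the sample rows are hypothetical stand-ins for the worker and the cursor rows. `Pool.map` hands each list item to the worker as a single argument, while `Pool.starmap` unpacks each item into positional arguments, which is why the (args, ) wrapping matters.

```python
import multiprocessing


def describe(ft):
    """Worker taking ONE argument: a (name, value) pair."""
    name, value = ft
    return "{}:{}".format(name, value)


def wrap_for_starmap(rows):
    """Wrap each row in a 1-tuple so starmap passes it as one argument."""
    return [(row,) for row in rows]


def main():
    rows = [("A", 1), ("B", 2)]  # hypothetical stand-in for cursor rows
    with multiprocessing.Pool(2) as pool:
        # map: each row arrives in the worker as a single argument.
        print(pool.map(describe, rows))
        # starmap: each item is unpacked, so the rows are wrapped first.
        print(pool.starmap(describe, wrap_for_starmap(rows)))


if __name__ == "__main__":
    main()
```

Passing the unwrapped rows to `starmap` would call `describe("A", 1)` and raise a `TypeError`, since the worker only accepts one parameter.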

The data that you pass to the worker has to be a built-in data type or structure. Geometry is an arcpy data type/object, so you need to create a pickleable class for it so that it can be converted to bytes and passed into the multiprocessing machinery.

Additionally, you should adjust the worker to return the exception as well. You're not doing anything with the return result, so it's hard to know what the issue is.

 

def multifunction(ft):
    # ... set up output and extent as before ...
    try:
        arcpy.management.GenerateTessellation(output, Extent=extent, Shape_Type="SQUARE", Size="900 SquareMeters")
        return {'result': 'success', 'msg': 'It worked'}

    except Exception as err:
        print("error", err)
        # return the message as text so it can be sent back to the parent process
        return {'result': 'error', 'msg': str(err)}

 

then go over the results like this:

 

with Pool(processes=4) as pool:
    # starmap unpacks each item, so every entry must be in the (args, ) format
    result = pool.starmap(multifunction, processList[0:10])

for res in result:
    print(f'{res["result"]} {res["msg"]}')
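For anyone who wants to try the pattern without arcpy, here is a minimal, self-contained sketch. `run_tool` is a hypothetical stand-in for the geoprocessing call (it just fails on negative numbers), and `safe_worker` is an illustrative name. The point is only that the worker always returns a plain dict, so a failure inside the child process shows up in the parent instead of vanishing.

```python
import multiprocessing
import traceback


def run_tool(item):
    """Hypothetical stand-in for the geoprocessing call; fails on negatives."""
    if item < 0:
        raise ValueError("bad input {}".format(item))
    return item * 2


def safe_worker(item):
    """Run the task and always return a plain, picklable status dict."""
    try:
        return {"item": item, "result": "success", "msg": str(run_tool(item))}
    except Exception as err:
        # Return text rather than the exception object so it always pickles.
        return {
            "item": item,
            "result": "error",
            "msg": "{}: {}".format(type(err).__name__, err),
            "trace": traceback.format_exc(),
        }


def main():
    with multiprocessing.Pool(2) as pool:
        for res in pool.map(safe_worker, [1, -1, 3]):
            print(res["item"], res["result"], res["msg"])


if __name__ == "__main__":
    main()
```

Note that this only surfaces Python exceptions. A tool that reports success without writing anything would still come back as 'success' here.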

 

by Anonymous User
Not applicable

Looks like we overlapped posts. If that is the case, I'd work on returning the error message from the worker using the example code in my last post and go from there.
DuncanHornby
MVP Notable Contributor

I would speculate the source of your problem is what you are trying to pass into your worker function; as @Anonymous User says, geometry is not pickleable. Just pass in the OID and then get hold of the geometry and its extent within the worker function. I worked up a template for people and wrote a short blog here.
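A rough sketch of that idea follows. It is not the template from the blog. It assumes the shapefile's object ID field is `FID` (typical for shapefiles; `arcpy.Describe(features).OIDFieldName` would confirm it) and reuses the `features` path from the original post. The function names are hypothetical.

```python
import os

basedir = r"C:\Users\Jay\Documents\ArcGIS\Projects\DataPrep"  # from the question
features = os.path.join(basedir, "features.shp")


def oid_where_clause(oid, oid_field="FID"):
    """Build a where clause selecting one feature by object ID.

    The "FID" default is an assumption (typical for shapefiles).
    """
    return "{} = {}".format(oid_field, int(oid))


def worker(oid):
    """Receive only an integer OID and fetch the geometry inside the process."""
    import arcpy  # imported inside the worker, as in the original post

    with arcpy.da.SearchCursor(features, ["Name", "SHAPE@"],
                               where_clause=oid_where_clause(oid)) as cursor:
        for name, shape in cursor:
            ext = shape.extent
            # Return plain Python values only, never arcpy objects.
            return name, (ext.XMin, ext.YMin, ext.XMax, ext.YMax)
    return None  # no feature with that OID


def list_oids():
    """Collect plain integer OIDs in the parent process."""
    import arcpy

    with arcpy.da.SearchCursor(features, ["OID@"]) as cursor:
        return [row[0] for row in cursor]
```

The parent would then call something like `pool.map(worker, list_oids())`, so only integers cross the process boundary.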

Jay_Gregory
Regular Contributor

@DuncanHornby @Anonymous User @DanPatterson I changed my setup so the extent is passed in as a "Space delimited string of coordinates" per the Generate Tessellation documentation (https://pro.arcgis.com/en/pro-app/2.8/tool-reference/data-management/generatetesellation.htm).  So at this point I'm just passing in an array with two strings to my multiprocessing pool function and no errors are thrown but the output is not created.  
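For reference, building that string from four coordinates can be sketched as below. `extent_to_string` is a hypothetical helper name, and the xmin, ymin, xmax, ymax order is taken from the hard-coded extent string used later in this thread.

```python
def extent_to_string(xmin, ymin, xmax, ymax):
    """Join four coordinates into a space-delimited 'xmin ymin xmax ymax' string."""
    return " ".join(repr(float(v)) for v in (xmin, ymin, xmax, ymax))


def to_task(name, xmin, ymin, xmax, ymax):
    """Pair a feature name with its extent string: two plain strings per task."""
    return [str(name), extent_to_string(xmin, ymin, xmax, ymax)]
```

With an arcpy extent object `ext`, the call would be `extent_to_string(ext.XMin, ext.YMin, ext.XMax, ext.YMax)`.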

If I replace the GenerateTessellation line with CreateFeatureclass, output is generated.  

I even tried hard coding the extent, hard coding the spatial reference as a string instead of an arcpy method, and nothing works.  I have been getting other multiprocessing code to work too, so I think there is something in the GenerateTessellation method that is causing this (but no errors are thrown).  At this point I have gone over every possible thing I could think of except reinstall Pro.

If someone wants to validate they can create a function with just one line.  Set your output to be some shapefile, and then run it once as a multiprocessing function.  Does it create the grid?

arcpy.management.GenerateTessellation(output, "-86.809667396 33.5218054870001 -86.7009855469999 33.6051117490001", "SQUARE", "900 SquareMeters", 'GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]];-400 -400 1000000000;-100000 10000;-100000 10000;8.98315284119521E-09;0.001;0.001;IsHighPrecision')
 
by Anonymous User
Not applicable

It created in my environment. I'm running Pro 2.9.2.

 

ABishop
MVP Regular Contributor

Hello Jeff,

I didn't see that this was solved yet, so I thought I would give this a try and offer up another idea?

Have you tried building this multiprocessing workflow into a model in ModelBuilder?  There are tools in ModelBuilder that may be able to iterate your processes without all the tedious code.  Once you get it working you can export the script for automation.

Amanda Bishop, GISP
Jay_Gregory
Regular Contributor

Thanks @ABishop  I'm comfortable doing this in ModelBuilder if I needed to.  My issue is I have a process I would like to automate that takes over an hour, and would like to improve the performance using the multiprocessing module in Python like I have with some other projects.  

ABishop
MVP Regular Contributor

It's no problem.  I just thought I would put it out there!  Sometimes when I am having trouble with my scripts, it helps me to see it from another viewpoint or platform.  Good luck!

Amanda Bishop, GISP
Jay_Gregory
Regular Contributor

@Anonymous User Would you mind posting your code? I just upgraded to 2.9.2 and still nothing. 

Mine is as simple as I could make it - still no errors and no output either.

import multiprocessing
from processRun import multifunction

def main():
    pool = multiprocessing.Pool(processes=4)
    pool.map(multifunction, range(0,1))
    pool.close()
    pool.join()

if __name__ == "__main__":
    main()

My processRun file looks like:

import arcpy
def multifunction(num):
    arcpy.management.GenerateTessellation("C:\\Users\\jay\\Documents\\test.shp", "-86.809667396 33.5218054870001 -86.7009855469999 33.6051117490001", "SQUARE", "900 SquareMeters", 'GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]];-400 -400 1000000000;-100000 10000;-100000 10000;8.98315284119521E-09;0.001;0.001;IsHighPrecision')

If yours is working, and I can't get it to work on my machine I'll probably just give up.  

by Anonymous User
Not applicable

Shoot, yes, I thought you wanted to test just the line, so that's what I did, and it generated the square. Thrown into the mp mix, it fails with a false positive:

Succeeded at Thursday, April 7, 2022 10:51:14 AM (Elapsed Time: 0.03 seconds)
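Since the tool reports success either way, one way to catch this inside the worker is to check for the output after the call. This is only a sketch with hypothetical helper names. It uses `os.path.exists`, which suits a shapefile output like the test above; for a feature class inside a geodatabase, `arcpy.Exists(path)` would be the equivalent check.

```python
import os


def output_was_created(path):
    """Return True if the expected file-based output exists on disk."""
    return os.path.exists(path)


def checked_result(path):
    """Build a status dict a worker can return after running a tool."""
    if output_was_created(path):
        return {"result": "success", "msg": path}
    return {
        "result": "error",
        "msg": "tool reported success but {} is missing".format(path),
    }
```

The worker would call `checked_result(output_path)` right after GenerateTessellation and return that dict to the parent process.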

I would open a case with ESRI if you could.