I have a standalone Python script that uses multiprocessing, and it hits an intermittent error when I call MakeFeatureLayer on a SQL Server table (which should support concurrent read access) from multiple processes. Similar problems have been reported here and here, but there are some differences and no clear solutions offered. The script runs fine with one worker process but fails intermittently when I go beyond that. Any ideas on why MakeFeatureLayer would fail like this? Am I wrong in assuming that it should work? Thanks
Here is the Python code that recreates the problem:
import arcpy
import os
import multiprocessing

BEE_FC = r'C:\Users\dmorri28\Documents\ArcGIS\Projects\MyProject\CC-SQL2k16-ERC-SDE.sde\ROW_Habitat.SDE.USA_Bee_Occurrence'
MP_PROCESSORS = 2
NUM_TASKS = 20
NUM_ITERATIONS = 20
PID = os.getpid()

def run_mp(task_idx):
    try:
        lyr = arcpy.management.MakeFeatureLayer(BEE_FC, f"intersect_lyr_{PID}")[0]
        arcpy.Delete_management(lyr.name)
    except Exception as e:
        print(f"PID: {PID} TASK: {task_idx} Exception: {str(e)}")
        e.add_note(f"TASK: {task_idx}")
        #raise
    return

def run():
    for i in range(0, NUM_ITERATIONS):
        print(f"Iteration: {i}")
        p = multiprocessing.Pool(MP_PROCESSORS)
        p.map(run_mp, list(range(0, NUM_TASKS)), 1)
        p.close()

if __name__ == '__main__':
    run()
Here is the output - you can see it fails on iterations 1 and 17, plus there is also a mysterious licensing error that pops up every once in a while...
Iteration: 0
Iteration: 1
PID: 45800 TASK: 1 Exception: Failed to execute. Parameters are not valid.
ERROR 000732: Input Features: Dataset C:\Users\dmorri28\Documents\ArcGIS\Projects\MyProject\CC-SQL2k16-ERC-SDE.sde\ROW_Habitat.SDE.USA_Bee_Occurrence does not exist or is not supported
Failed to execute (MakeFeatureLayer).
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\multiprocessing\spawn.py", line 122, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\multiprocessing\spawn.py", line 131, in _main
prepare(preparation_data)
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\multiprocessing\spawn.py", line 244, in prepare
_fixup_main_from_name(data['init_main_from_name'])
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\multiprocessing\spawn.py", line 268, in _fixup_main_from_name
main_content = runpy.run_module(mod_name,
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen runpy>", line 226, in run_module
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 88, in _run_code
File "c:\ROW\DEV\GitRepos\ROW_as_habitat\test\test_mp.py", line 1, in <module>
import arcpy
File "C:\Program Files\ArcGIS\Pro\Resources\ArcPy\arcpy\__init__.py", line 77, in <module>
from arcpy.geoprocessing import gp
File "C:\Program Files\ArcGIS\Pro\Resources\ArcPy\arcpy\geoprocessing\__init__.py", line 14, in <module>
from ._base import *
File "C:\Program Files\ArcGIS\Pro\Resources\ArcPy\arcpy\geoprocessing\_base.py", line 14, in <module>
import arcgisscripting
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\arcgisscripting\__init__.py", line 131, in <module>
from ._arcgisscripting import *
RuntimeError: The Product License has not been initialized.
Iteration: 7
Iteration: 8
Iteration: 9
Iteration: 10
Iteration: 11
Iteration: 12
Iteration: 13
Iteration: 14
Iteration: 15
Iteration: 16
Iteration: 17
PID: 35152 TASK: 0 Exception: Failed to execute. Parameters are not valid.
ERROR 000732: Input Features: Dataset C:\Users\dmorri28\Documents\ArcGIS\Projects\MyProject\CC-SQL2k16-ERC-SDE.sde\ROW_Habitat.SDE.USA_Bee_Occurrence does not exist or is not supported
Failed to execute (MakeFeatureLayer).
Iteration: 18
Iteration: 19
This is a Windows system running ArcGIS Pro 3.4.0 under a Named User License that is authorized to work offline.
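Since the ERROR 000732 failures above are intermittent, one possible stopgap (not a fix for the root cause) is to retry the call a few times before giving up. Below is a minimal sketch of that idea; `call_with_retry` and its parameters are hypothetical names for illustration, not part of arcpy.

```python
import time

def call_with_retry(func, attempts=3, delay=0.5, exceptions=(Exception,)):
    """Call func() up to `attempts` times, sleeping a little longer after each failure.

    Hypothetical helper: returns func()'s result on the first success and
    re-raises the last error if every attempt fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions as err:
            last_error = err
            if attempt < attempts:
                # Linear backoff: delay, 2*delay, 3*delay, ...
                time.sleep(delay * attempt)
    raise last_error
```

In the worker, the MakeFeatureLayer call could then be wrapped as `call_with_retry(lambda: arcpy.management.MakeFeatureLayer(BEE_FC, f"intersect_lyr_{PID}")[0], exceptions=(arcpy.ExecuteError,))`. This only hides the symptom, so it is worth logging each retry to see how often it happens.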
Is this related?
Parallel Processing Factor (Environment setting)—ArcGIS Pro | Documentation
(under the usage notes and the SQL reference)
Good point - I hadn't seen that but we are running "Microsoft SQL Server Enterprise" and it looks like the documented restriction only applies to "SQL Server Express". In addition I set the arcpy.env.parallelProcessingFactor value to "0" in the worker processes since I don't want both my code and the ESRI tools trying to spread the workload across multiple processors.
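For anyone following along, here is a minimal sketch of how that per-worker setting could be applied once per process with a Pool initializer, rather than inside every task. `init_worker` and `make_pool` are hypothetical names; `arcpy.env.parallelProcessingFactor` is the environment setting discussed above. Importing arcpy inside the initializer is an assumption about what works best, not something confirmed in this thread.

```python
import multiprocessing

def init_worker(factor="0"):
    """Runs once in each worker process, right after the process starts."""
    # Imported here (an assumption) so each worker sets up its own arcpy session.
    import arcpy
    # "0" keeps the geoprocessing tools from spreading work across cores themselves.
    arcpy.env.parallelProcessingFactor = factor

def make_pool(processes):
    """Create a Pool whose workers all call init_worker() on startup."""
    return multiprocessing.Pool(processes, initializer=init_worker, initargs=("0",))
```

The pool would still need to be created under the `if __name__ == '__main__':` guard, as in the original script.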
I have had absolutely no luck getting any arcpy passthrough functions to work with multiprocessing. It's always the "The Product License has not been initialized." RuntimeError too, no matter what license format I use.
I'm hoping that the upgrade to Python 3.14 soon will solve that, with multiple interpreters being allowed in a single process, because I'm thinking it's something about how Python forks into a new process that's not compatible with their license check.
I have been able to get Cursors to work and I have also directly modified CIM elements with multiprocessing, but the global arc_object that's initialized on arcpy import seems to lock out multiprocessing code for now.
I might try messing with sharing arcpy as a global state for the processes. Possibly maintain a lock on the gp object that's created in arcpy initialization, or just store the globals() of the first run in a cache and pass that as context to the Pool executor.
Either way, making it work isn't simple, and the stack trace is brick walled as soon as it hits the CPython API calls.