I have a script that processes a large batch of polygons using nested ProcessPoolExecutors. After a certain point, I get this error:
Exception has occurred: RuntimeError (note: full exception trace is shown but execution is paused at: <module>)
The Product License has not been initialized.
File "C:\Program Files\ArcGIS\Pro\Resources\ArcPy\arcpy\geoprocessing\_base.py", line 14, in <module>
import arcgisscripting
File "C:\Program Files\ArcGIS\Pro\Resources\ArcPy\arcpy\geoprocessing\__init__.py", line 14, in <module>
from ._base import *
File "C:\Program Files\ArcGIS\Pro\Resources\ArcPy\arcpy\__init__.py", line 77, in <module>
from arcpy.geoprocessing import gp
File "C:\Users\redacted\example.py", line 1, in <module>
import arcpy
File "<string>", line 1, in <module> (Current frame)
RuntimeError: The Product License has not been initialized.
I have tried several things and still run into this issue. The only "solution" I've found is to keep the total number of processes (max_concurrency * max_children) below the mid-20s, even on machines with plenty of cores and RAM.
Here is an example script:
import arcpy
import itertools
import uuid
import time
import concurrent.futures
import random

arcpy.env.overwriteOutput = True
arcpy.env.workspace = r"C:\path\to\your.gdb"  # point towards your geodatabase
output = "test_number"
max_concurrency = 20
max_children = 20

def process_number(number):
    time.sleep(random.random() * 3)
    return number

def process_numbers_multi(number, num_range):
    print(f"PROCESSING STARTED ON NUMBER: {number}")
    nums = list(range(number, number + num_range))
    input_handler = iter(nums)
    num_results = []
    with concurrent.futures.ProcessPoolExecutor(max_workers=max_children) as child_executor:
        futures = {
            child_executor.submit(process_number, part): part
            for part in itertools.islice(input_handler, max_children)
        }
        while futures:
            done, _ = concurrent.futures.wait(
                futures, return_when=concurrent.futures.FIRST_COMPLETED
            )
            for fut in done:
                original_input = futures.pop(fut)
                try:
                    results = fut.result()
                except Exception as exc:
                    print(f"{original_input} generated an exception: {exc}")
                else:
                    num_results.append(results)
            # islice on an exhausted iterator simply yields nothing;
            # an any() check here would silently consume items
            for part in itertools.islice(input_handler, len(done)):
                fut = child_executor.submit(process_number, part)
                futures[fut] = part
    return num_results

def create_feature_class():
    # Create transect FC. Add fields.
    trans_fc = arcpy.management.CreateFeatureclass(out_path=arcpy.env.workspace,
                                                   out_name=output)
    flds = [("NUMBER_GUID", "GUID"), ("NUMBER", "DOUBLE")]
    for fld_name, fld_type in flds:
        arcpy.management.AddField(in_table=trans_fc, field_name=fld_name,
                                  field_type=fld_type)
    print("CREATED TRANSECT FC")
    return trans_fc

def write_to_db(number, trans_fc):
    flds = ["NUMBER_GUID", "NUMBER"]
    print("WRITING NUMBER")
    # open the cursor once and reuse it for every row
    with arcpy.da.InsertCursor(trans_fc, flds) as icurs:
        for rows in number:
            icurs.insertRow([rows[0], rows[1]])

def main():
    start_time = time.time()
    print("PROCESS STARTING")
    trans_fc = create_feature_class()
    num_list = list(range(0, 200))
    input_handler = iter(num_list)
    num_range = 10
    full_results = []
    """tmp = process_numbers_multi(num_list[0], num_range)
    for i in tmp:
        full_results.append([uuid.uuid4(), i])
    write_to_db(full_results, trans_fc)"""
    with concurrent.futures.ProcessPoolExecutor(max_workers=max_concurrency) as executor:
        futures = {
            executor.submit(process_numbers_multi, part, num_range): part
            for part in itertools.islice(input_handler, max_concurrency)
        }
        while futures:
            done, _ = concurrent.futures.wait(
                futures, return_when=concurrent.futures.FIRST_COMPLETED
            )
            for fut in done:
                original_input = futures.pop(fut)
                try:
                    results = fut.result()
                except Exception as exc:
                    print(f"{original_input} generated an exception: {exc}")
                else:
                    for x in results:
                        full_results.append([uuid.uuid4(), x])
                    write_to_db(full_results, trans_fc)
                    full_results = []
            # refill from the iterator; islice handles exhaustion
            for part in itertools.islice(input_handler, len(done)):
                fut = executor.submit(process_numbers_multi, part, num_range)
                futures[fut] = part
    end_time = time.time()
    print("PROCESS COMPLETE")
    print(f"Elapsed Time: {end_time - start_time}")
    return True

if __name__ == '__main__':
    main()
Lowering the max_concurrency and max_children values makes the error less likely, but I've had it occur even when max_concurrency * max_children <= os.cpu_count(). It just triggers more reliably with a larger number of processes.
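In code, the workaround amounts to budgeting the total before building the pools (the 24 here is just my observed ceiling, not a documented limit):

import os

# keep outer * inner below the ceiling where the license error appears
total_budget = min(24, os.cpu_count() or 1)
max_concurrency = 4
max_children = max(1, total_budget // max_concurrency)  # 6 with a budget of 24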
There was a similar case where a user was trying to run a Python script via Task Scheduler and getting the same RuntimeError. They resolved the issue by doing the following:
- Navigate to the Project tab > Package Manager
- Click the three dots > Clone (this can take some time)
- Activate the cloned environment from the three dots > Activate, then reopen ArcGIS Pro and re-run the script; for them, it then executed without any error
Sadly, this did not work.
When I tried running your code I got 20 processes in (with 200-something VSCode subprocesses) and was using almost 10 GB of RAM! If you're doing so much number crunching that you need 400-odd concurrent processes, you should look for a Python library that manages its own thread pool (polars is my go-to; numpy might also work here). Failing that, you can rewrite your program so that arcpy is imported only after every future has resolved. That cuts the size of each worker process from ~200 MB to ~35 MB on my machine, which kept my run below 8 GB (and it will probably run faster overall, since you're writing all the data through a single cursor).
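A minimal sketch of what I mean (names and paths are placeholders): the workers never touch arcpy, and the parent imports it only after the pool has shut down:

import concurrent.futures
import uuid

def process_number(number):
    # pure-Python work only; arcpy is never imported in the workers
    return number

def main():
    results = []
    with concurrent.futures.ProcessPoolExecutor(max_workers=8) as executor:
        for value in executor.map(process_number, range(200)):
            results.append([uuid.uuid4(), value])
    # import arcpy only now, after every future has resolved
    import arcpy
    arcpy.env.workspace = r"C:\path\to\your.gdb"  # placeholder
    with arcpy.da.InsertCursor("test_number", ["NUMBER_GUID", "NUMBER"]) as icurs:
        for row in results:
            icurs.insertRow(row)

if __name__ == "__main__":
    main()

Because Windows spawns each worker by re-importing the main module, keeping the arcpy import inside main() means the children never pay its ~200 MB cost.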
I already rewrote the nested portion to use shapely instead of arcpy and that improved both memory usage and speed.
I need to output the results of each future as they complete, to avoid a scenario where the script runs for multiple days and then unexpectedly stops. I'm thinking of just outputting WKB to a file to avoid arcpy during the processing stage. I mainly wanted to see whether I was hitting an arcpy issue or whether my approach was flawed.
I'll look into polars and numpy.
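Roughly what I have in mind for the WKB dump (the worker and output path are placeholders; assumes shapely):

import concurrent.futures
from shapely import wkb
from shapely.geometry import Point

def buffer_point(xy):
    # stand-in for the real polygon processing
    return Point(xy).buffer(10.0)

def main():
    coords = [(i, i) for i in range(200)]  # stand-in inputs
    with open(r"C:\temp\results.wkb", "ab") as f, \
            concurrent.futures.ProcessPoolExecutor() as executor:
        futures = [executor.submit(buffer_point, xy) for xy in coords]
        for fut in concurrent.futures.as_completed(futures):
            data = wkb.dumps(fut.result())
            # length-prefix each record so the stream can be re-read later
            f.write(len(data).to_bytes(8, "little"))
            f.write(data)

if __name__ == "__main__":
    main()

That way a crash on day three only loses the in-flight futures, not everything already written.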
Can't help you with the specific error message that you receive, but a couple of remarks:
- If you are still running Windows 10, you may not be able to go beyond 64 processes due to limitations with Windows processor groups:
https://bitsum.com/general/the-64-core-threshold-processor-groups-and-windows/
It is still not entirely clear to me whether and how Windows 11 handles this, and whether it allows truly unlimited process counts. There were changes to this aspect of Windows, but I am still on Windows 10, so I can't verify them.
- Do you really need processes? If you connect to a database, the database itself may turn a threaded application into something close to a process-based multiprocessing solution, while still letting you use threads in your Python application.
For example, in the screenshot below I am using a concurrent.futures.ThreadPoolExecutor to run up to 44 threads issuing SQL statements that generalize data on the database using PostGIS commands. As the inset of the remote desktop session to the server shows, this pushes the PostgreSQL database to a full 100% CPU usage, without any Python processes.
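The pattern is roughly this (hypothetical connection string and table names; assumes psycopg2):

import concurrent.futures
import psycopg2  # assumes a reachable PostGIS database

def generalize(table):
    # each thread opens its own connection; the heavy lifting happens
    # inside PostgreSQL, so the GIL barely matters here
    with psycopg2.connect("dbname=gis user=gis") as conn:
        with conn.cursor() as cur:
            cur.execute(
                f"UPDATE {table} SET geom = ST_SimplifyPreserveTopology(geom, 10.0)"
            )

tables = [f"roads_part_{i}" for i in range(44)]  # hypothetical partitions
with concurrent.futures.ThreadPoolExecutor(max_workers=44) as executor:
    list(executor.map(generalize, tables))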
I'm on Windows 11. I initially used ThreadPoolExecutor, but I'm not I/O bound, and the GIL made it much slower than ProcessPoolExecutor.
@RyJohnsen wrote: I'm on Windows 11. I initially used ThreadPoolExecutor, but I'm not I/O bound, and the GIL made it much slower than ProcessPoolExecutor.
Yes, whether ThreadPoolExecutor or ProcessPoolExecutor is the better solution depends on how much actual CPU work versus I/O you are doing (and on available resources such as RAM, although that is becoming a fairly moot point on modern, powerful desktops, which usually have plenty).
However, in my experience it is pretty hard to be purely CPU bound rather than I/O bound. You really need to do significant work per item to be CPU limited.
Also note that, for Python on Windows, the maximum number of workers a ProcessPoolExecutor can launch is 61, according to the concurrent.futures documentation: https://docs.python.org/3/library/concurrent.futures.html
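So it is worth clamping explicitly, for example:

import concurrent.futures
import os

def main():
    # Windows limits ProcessPoolExecutor to 61 workers, so clamp to that
    workers = min(61, os.cpu_count() or 1)
    with concurrent.futures.ProcessPoolExecutor(max_workers=workers) as executor:
        print(sum(executor.map(abs, range(-100, 100))))

if __name__ == "__main__":
    main()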
By the way, if you don't want irritating pop-ups of command windows while the worker processes run, you can change the Python executable used to spawn them, which prevents the pop-ups and makes each worker behave as if it were a thread (although it is still a process):
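Something along these lines, pointing multiprocessing at pythonw.exe (the exact path depends on your ArcGIS Pro Python environment):

import multiprocessing
import os
import sys

# spawn workers with pythonw.exe so no console window appears per process;
# call this before creating any ProcessPoolExecutor
multiprocessing.set_executable(os.path.join(sys.exec_prefix, "pythonw.exe"))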
Running it with a max_concurrency of 6 and max_children of 10, for a maximum of 60 processes, caused the same issue, so that doesn't seem to be the cause.