
Add @functools.lru_cache decorator to utils.fetchautocomplete

02-23-2025 11:04 AM
Status: Open
HaydenWelch
MVP Regular Contributor

Issue Description

I have been working on rewriting a lot of the utils module in arcpy, as the code in there is an absolute mess and is part of the reason the Python Window in Pro is so slow.

As I was going through it, I noticed a `fetchautocomplete` function that is called every time the cursor changes position. It searches the stack frame for autocomplete options, which are handed to the renderer in ArcGIS Pro to build the autocomplete popup menu.

In my testing, this function can take up to 200 ms to run, and that cost is paid every time the user types a '.', since the function has to traverse the entire global stack for possible autocomplete options.

One Line Fix

The `utils` module already uses functools everywhere. functools provides an amazing decorator, `functools.lru_cache`, which caches function calls and their return values: if a function is called with the same arguments more than once, the cached value is returned instead of recomputing. Here's the proposed change:

```python
@functools.lru_cache(maxsize=128)
def fetchautocomplete(code, string_index):
    """Function for ArcGIS desktop - fetches autocomplete candidates for
       python window"""
    ...
```

This will cache the 128 most recently seen argument pairs and their responses. That means the 200 ms cost of searching autocompletes for `arcpy.` is only paid once; all subsequent `arcpy.` calls return in ~100 ns, roughly a million-fold speedup:

```python
# Without lru_cache
>>> %timeit fetchautocomplete('arcpy.', 6)
121 ms ± 1.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# With lru_cache
>>> %timeit fetchautocomplete('arcpy.', 6)
134 ns ± 0.581 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
```

There are other functions in here that are constantly called in the window and can sometimes take up to 10 s to execute, but those are problems for another day. For now, I think a roughly million-fold performance boost on a frequently used function, with one line of code and no additional imports, is pretty good.
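The effect is easy to reproduce outside of Pro. Below is a minimal, self-contained sketch; `fetch_candidates` and its simulated 50 ms delay are stand-ins for the real `fetchautocomplete` and its stack-frame search, not the actual arcpy code:

```python
import functools
import time

@functools.lru_cache(maxsize=128)
def fetch_candidates(code, string_index):
    """Simulated stand-in for fetchautocomplete: pretend the stack-frame
    search costs 50 ms, then return some candidate names."""
    time.sleep(0.05)  # simulate the expensive traversal
    return ("da", "env", "management")

start = time.perf_counter()
fetch_candidates("arcpy.", 6)          # first call pays the full cost
first = time.perf_counter() - start

start = time.perf_counter()
fetch_candidates("arcpy.", 6)          # same arguments: served from cache
second = time.perf_counter() - start

print(f"first:  {first * 1000:.1f} ms")
print(f"cached: {second * 1000:.4f} ms")
print(fetch_candidates.cache_info())
# CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)
```

The decorator also exposes `cache_info()` and `cache_clear()` for free, which is useful for checking hit rates and for invalidation.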

 

Note:

In my testing, the largest speedup came from functions that traverse the current project's data sources, e.g. `arcpy.da.SearchCursor`. The first run in a large project took 15 s; subsequent calls took ~100 ns. Without the cache, that 15 s cost was paid every time you typed `arcpy.da.SearchCursor(`.

Currently this optimization only works when the code string matches exactly, because the entire contents of the REPL are handed to the `fetchautocomplete` function. To make this a general speedup, a custom cache would need to be created that associates the actual identifier with its autocomplete options, so that anything ending in `da.SearchCursor(` hits the same entry:

```python
# cached
arcpy.da.SearchCursor(

# Different cache entry for
with arcpy.da.SearchCursor(
```

The way the code is currently written, the full text of the input string and the index of the cursor are passed to each subsequent function. If instead only the relevant part of the string were passed to each function, and each of them maintained its own cache, the performance of the Python Window would improve with use. This is a bit out of scope for this Idea, as it would require a total rewrite of the parsing functions in the utils module (something I will probably get around to eventually).
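One possible shape for such an identifier-keyed cache. This is a hypothetical sketch, not the real parsing code: the regex, the `_KNOWN` candidate table, and the function names are all illustrative stand-ins.

```python
import functools
import re

# Toy candidate table standing in for the real stack-frame search.
_KNOWN = {
    "arcpy.": ("da", "env", "management"),
    "arcpy.da.": ("SearchCursor", "UpdateCursor", "InsertCursor"),
}

# Trailing dotted identifier ending in '.', e.g. 'arcpy.da.'
_TRAILING = re.compile(r"[A-Za-z_][\w.]*\.$")

@functools.lru_cache(maxsize=128)
def _candidates_for(identifier):
    # The expensive lookup now happens once per identifier,
    # not once per unique REPL string.
    return _KNOWN.get(identifier, ())

def fetch_autocomplete(code, string_index):
    """Key the cache on the trailing identifier before the cursor, so
    'arcpy.da.' and 'with arcpy.da.' share one cache entry."""
    match = _TRAILING.search(code[:string_index])
    return _candidates_for(match.group(0)) if match else ()

print(fetch_autocomplete("arcpy.da.", 9))
# ('SearchCursor', 'UpdateCursor', 'InsertCursor')
print(fetch_autocomplete("with arcpy.da.", 14))  # same cache entry
# ('SearchCursor', 'UpdateCursor', 'InsertCursor')
print(_candidates_for.cache_info().hits)
# 1
```

Because the cache key is the identifier rather than the whole REPL buffer, any surrounding text (a `with` statement, an assignment, earlier lines) no longer defeats the cache.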

 

Note2: This is the current function call chain for the autocomplete parsing

[image: HaydenWelch_0-1740341617088.png — autocomplete call-chain diagram]

Each of these functions parses the entire contents of the REPL.

 

Note3:

Caching introduces one wrinkle for the arcpy functions that list the current map's layers.

This was my biggest issue: for large projects this listing can take up to 30 seconds, and it blocks the main thread, meaning you literally can't interact with Pro while it's trying to find out what layers are in your map.

The issue is that if you switch maps while a layer cache is active, the cached layer list won't update to the new map's layers. Honestly, I prefer an out-of-date layer list to excruciatingly unresponsive load times that freeze the program, but I can see how this would confuse users who aren't thinking about the cache. The quick solution would be to remove layer autocomplete and just let the user drag the layer into the function. Otherwise, the whole layer-autocomplete process would likely need to be reworked; I can't even tell where it's happening in the original code...
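If a cached layer lookup is kept, `functools.lru_cache` at least provides `cache_clear()` for invalidation. A minimal sketch under an assumption: that Pro could call some hook when the active map changes (it may expose no such event today; `on_active_map_changed` and `layer_names` are made up for illustration).

```python
import functools

@functools.lru_cache(maxsize=32)
def layer_names(map_id):
    """Stand-in for the expensive layer listing (up to 30 s in a big
    project). Keyed by a map identifier, so a different map is
    automatically a cache miss."""
    # ...real code would walk the project's data sources here...
    return ("Roads", "Parcels")

def on_active_map_changed(event=None):
    # Hypothetical invalidation hook: if Pro fired an event when the
    # active map changes, clearing the cache here would prevent stale
    # layer lists without giving up the speedup.
    layer_names.cache_clear()

layer_names("Map1")
layer_names("Map1")                       # served from cache
print(layer_names.cache_info().hits)      # 1
on_active_map_changed()
print(layer_names.cache_info().currsize)  # 0 after clearing
```

This keeps the fast path (repeated lookups within one map) while bounding staleness to the window between layer edits and the next invalidation event.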