
Python Tool instance parameters bug

HaydenWelch
Occasional Contributor

I've noticed that instance attributes set on `self` in Python toolboxes don't actually survive between calls to updateParameters and updateMessages

e.g.

import arcpy

class BrokenTool(object):
    def __init__(self) -> None:
        self.label = "Tool"
        self.description = "My Tool"
        self.category = "Useful Tools"
        
        self.paramA = None
        self.paramB = 1
        return
    
    def getParameterInfo(self) -> list:
        p1 = arcpy.Parameter(
            displayName="Parameter A",
            name="paramA",
            datatype="GPString",
            parameterType="Required",
            direction="Input"
        )
        return [p1]
    
    def updateParameters(self, parameters: list) -> None:
        self.paramA = "Hello World"
        parameters[0].value = "paramA set to Hello World"
        return
    
    def updateMessages(self, parameters: list) -> None:
        self.paramB = 2
        parameters[0].setWarningMessage("paramB set to 2")
        return
    
    def execute(self, parameters: list, messages: list) -> None:
        arcpy.AddMessage(f"{self.paramA=}, expected 'Hello World'")
        arcpy.AddMessage(f"{self.paramB=}, expected 2")
        return

 

When opened in ArcGIS Pro, the tool shows up like this:

[Screenshot: tool dialog showing the parameter value and warning message that were set during validation]

This shows that the assignments did run during validation. But when you execute the tool:

[Screenshot: tool messages showing self.paramA=None and self.paramB=1]

The instance attributes were never updated.
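What this looks like in plain Python (no arcpy, names hypothetical): if the host builds a fresh instance for each callback, anything set on `self` in one callback is gone by the next. A minimal sketch of that hypothesis:

```python
# Pure-Python sketch of the observed behavior: attributes set on self in one
# callback never reach the next if each callback runs on a fresh instance.

class Tool:
    def __init__(self):
        self.paramA = None  # reset on every re-instantiation

    def updateParameters(self):
        self.paramA = "Hello World"

    def execute(self):
        return self.paramA

# If the framework reused one instance, the attribute would survive:
shared = Tool()
shared.updateParameters()
print(shared.execute())  # Hello World

# But if it builds a fresh instance per call, the assignment is lost:
Tool().updateParameters()  # runs on a throwaway instance
print(Tool().execute())    # None
```

This matches what the screenshots show: the validation callbacks clearly ran, but their writes to `self` never reached `execute`.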

 

The odd thing is that if you exploit Python's mutable-default-argument behavior (defaults are evaluated once, at function definition time), you can work around this by storing values across calls in a mutable default list:

import arcpy

class BrokenTool(object):
    def __init__(self) -> None:
        self.label = "Tool"
        self.description = "My Tool"
        self.category = "Useful Tools"
        
        self.paramA = None
        self.paramB = 1
        return
    
    def getParameterInfo(self) -> list:
        p1 = arcpy.Parameter(
            displayName="Parameter A",
            name="paramA",
            datatype="GPString",
            parameterType="Required",
            direction="Input"
        )
        return [p1]
    
    def updateParameters(self, parameters: list) -> None:
        self.paramA = "Hello World"
        self._memory_hack(pA=self.paramA)
        parameters[0].value = "paramA set to Hello World"
        return
    
    def updateMessages(self, parameters: list) -> None:
        self.paramB = 2
        self._memory_hack(pB=self.paramB)
        parameters[0].setWarningMessage("paramB set to 2")
        return
    
    def execute(self, parameters: list, messages: list) -> None:
        arcpy.AddMessage(f"{self.paramA=}, expected 'Hello World'")
        arcpy.AddMessage(f"{self.paramB=}, expected 2")
        arcpy.AddMessage("\nExecuting Memory Hack...")
        self.paramA, self.paramB = self._memory_hack()
        arcpy.AddMessage(f"{self.paramA=}, expected 'Hello World'")
        arcpy.AddMessage(f"{self.paramB=}, expected 2")
        return
    
    # This is disgusting, don't do this. It relies on Python evaluating default
    # arguments once (at function definition time) to store state, because
    # Python toolboxes don't actually persist self. attributes across
    # updateParameters and updateMessages calls
    def _memory_hack(self, pA=None, pB=None,
                     paramA=[None], paramB=[None]) -> tuple:
        """ DO NOT EVER DO THIS """
        if pA:
            paramA[0] = pA
        if pB:
            paramB[0] = pB
        return (paramA[0], paramB[0])

Will return:

[Screenshot: tool messages showing the self. attributes lost, then recovered via the memory hack]
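The behavior the hack leans on is easy to see in plain Python (no arcpy, names hypothetical): the default list is created once when the function is defined, so it is shared across calls and even across fresh instances, because the function object itself lives on the class:

```python
# Default argument values are evaluated once, at definition time. A mutable
# default therefore persists across calls -- and across re-instantiation,
# since every instance shares the same function object on the class.

class Tool:
    def remember(self, value=None, _store=[None]):  # _store created once
        if value is not None:
            _store[0] = value
        return _store[0]

Tool().remember("Hello World")  # stored on the shared default list
print(Tool().remember())        # Hello World -- survives a fresh instance
```

This is exactly why the workaround survives the re-initialization that wipes out ordinary `self` attributes.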

 

I guess this means that the tool object is being constantly rebuilt by the framework. Is there a good internal reason for this? There are valid reasons to want to store data in an instance attribute (e.g. an expensive data check or calculation that has to be done during the validation loop but whose results could be passed on to the execute call), so why is the Tool object re-initialized over and over instead of either mutated in place or re-initialized as a carbon copy of the original object and its __dict__?

4 Replies
AlfredBaldenweck
MVP Regular Contributor

Interesting!

Out of curiosity, what's your workflow that inspired you to try doing this? Like, why are you setting self parameters instead of just using the normal parameters list?

HaydenWelch
Occasional Contributor

I've got a lot of Tools. My current toolbox-building workflow is dynamic: the toolbox is compiled from a module dictionary at load time, allowing me to quickly swap out tools in an active central toolbox.

This also allows me to track individual tools more effectively using Git (they all live in a single tool file that is imported and reloaded when the toolbox is refreshed).

I also tend to do a large amount of pre-processing on tool open (e.g. pulling feature classes from a specific database and validating schemas, or building a dictionary of feature values for reference in the parameter object on load. I'll occasionally drive parameter values with data from the project by hijacking the tool's __init__).

In my current use case, I'm trying to build a tool that batch appends all features from a list of source databases into a target database (usually to leverage new Arcade rules in the target database). So I need to do a lot of schema checks on tool load to make sure the merge is possible given a set of merge parameters. These checks are very expensive (10-15 seconds per database), so being able to skip the schema dictionary construction on execute, or maintain a cache of validated schema fingerprints in an instance attribute, massively improves tool usability.
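The fingerprint-cache idea can be sketched outside of arcpy (all names here are hypothetical, and the stand-in check replaces the real 10-15 second database work): key the expensive validation on a cheap, stable hash of the schema, so repeated validation passes skip the heavy lifting when nothing has changed.

```python
import hashlib

# fingerprint -> validation result; survives as long as the module does
_VALIDATED = {}

def schema_fingerprint(field_defs):
    """Cheap, stable hash of a schema described as (name, type) pairs."""
    raw = ";".join(f"{n}:{t}" for n, t in sorted(field_defs))
    return hashlib.sha256(raw.encode()).hexdigest()

def validate_schema(field_defs, expensive_check):
    """Run expensive_check only when this schema hasn't been seen before."""
    key = schema_fingerprint(field_defs)
    if key not in _VALIDATED:
        _VALIDATED[key] = expensive_check(field_defs)
    return _VALIDATED[key]

calls = []
def slow_check(fields):
    calls.append(1)  # stand-in for the 10-15 s per-database check
    return all(t for _, t in fields)

schema = [("OBJECTID", "OID"), ("NAME", "TEXT")]
validate_schema(schema, slow_check)
validate_schema(schema, slow_check)  # cache hit: slow_check not re-run
print(len(calls))  # 1
```

Whether the cache dict actually persists between validation and execute depends on how the host manages the module, which is exactly the open question in this thread.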

I might try extending the Parameter class, though, and storing values in that instead, if instance attributes are unstable/immutable by design. I don't want to write to global function-header defaults if I can avoid it lol.

BlakeTerhune
MVP Regular Contributor

Some of these topics would make some interesting blog articles, @HaydenWelch. Hopefully you'll consider sharing your methods with the world.

HaydenWelch
Occasional Contributor

I've been working on cleaning up some of my dirty implementation and trying to get it packaged up on GitHub. I'm the sole dev currently, though, so my plate's pretty full.

 

I have some basic framework stuff up already under pytframe and pytframe2 on GitHub, but it still needs a lot of cleanup and some more functionality added.

The main goal is to simplify rollout of production patches and new functionality for Python toolboxes. This system allows for hot fixes, and users can either work off a shared network folder or run a local repo for their toolboxes, pulling down updates and switching branches through git commands.

I've been thinking about hiding the Git commands behind a Tool in the toolbox, but haven't gotten around to it. Might try to implement that in pytframe2 tonight.