Sync the string literals for data objects and their mutator functions

Friday
Status: Open
HaydenWelch
MVP Regular Contributor

There's an irritating de-sync between object class attributes and object mutation/creation functions in arcpy. Many creation functions take a string literal flag that also exists as an instance attribute of the returned type, but the strings are different for access and creation.

 

For example, a Domain object is defined this way:

class Domain:
    codedValues: dict[ValueType, str]
    description: str
    domainType: Literal["CodedValue", "Range"]
    mergePolicy: Literal["AreaWeighted", "DefaultValue", "SumValues"]
    name: str
    owner: str
    range: tuple[ValueType, ValueType]
    splitPolicy: Literal["DefaultValue", "Duplicate", "GeometryRatio"]
    type: DomainFieldType

 

While the CreateDomain function expects arguments to be formatted this way:

def CreateDomain(
    in_workspace: Unknown | None = None,
    domain_name: Unknown | None = None,
    domain_description: Unknown | None = None,
    field_type: Literal['SHORT', 'LONG', 'BIGINTEGER', 'FLOAT', 'DOUBLE', 'TEXT', 'DATE', 'DATEONLY', 'TIMEONLY'] | None = None,
    domain_type: Literal['CODED', 'RANGE'] | None = None,
    split_policy: Literal['DEFAULT', 'DUPLICATE', 'GEOMETRY_RATIO'] | None = None,
    merge_policy: Literal['DEFAULT', 'SUM_VALUES', 'AREA_WEIGHTED'] | None = None
) -> Result1[str]

 

This means that "copying" attributes from a Domain object to another place requires a ton of manual attribute mapping (CodedValue -> CODED, etc.).
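As a rough sketch of what that manual mapping looks like, here are lookup tables built from the two definitions above. The pairings are inferred from the names alone (I'm assuming "DefaultValue" corresponds to 'DEFAULT', and so on), create_domain_kwargs is a hypothetical helper, and a SimpleNamespace stands in for a real Domain object so the snippet runs without arcpy. The field type is left out because the values of DomainFieldType aren't shown here.

```python
from types import SimpleNamespace

# Assumed pairings between Domain attribute values and CreateDomain flags,
# inferred from the names in the two signatures above.
DOMAIN_TYPE = {"CodedValue": "CODED", "Range": "RANGE"}
SPLIT_POLICY = {
    "DefaultValue": "DEFAULT",
    "Duplicate": "DUPLICATE",
    "GeometryRatio": "GEOMETRY_RATIO",
}
MERGE_POLICY = {
    "DefaultValue": "DEFAULT",
    "SumValues": "SUM_VALUES",
    "AreaWeighted": "AREA_WEIGHTED",
}


def create_domain_kwargs(domain):
    """Hypothetical helper: translate Domain attributes into CreateDomain keyword arguments."""
    return {
        "domain_name": domain.name,
        "domain_description": domain.description,
        "domain_type": DOMAIN_TYPE[domain.domainType],
        "split_policy": SPLIT_POLICY[domain.splitPolicy],
        "merge_policy": MERGE_POLICY[domain.mergePolicy],
    }


# Stand-in for a Domain object; only the attributes used above are set.
example = SimpleNamespace(
    name="PipeMaterial",
    description="Pipe material codes",
    domainType="CodedValue",
    splitPolicy="GeometryRatio",
    mergePolicy="AreaWeighted",
)
kwargs = create_domain_kwargs(example)
```

Every flag parameter needs its own table like this, and that's the boilerplate I'd like to see go away.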

 

I know this is because the internal schema of the CIM is represented using the strings that are shown in the class definition, but is there a good reason to not allow either string value to be used in the function call? Ideally, these flags would accept either representation of the field state, or be typed in a way that allows either string flag to be used since there is no real overlap between the two.
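To illustrate the "accept either representation" idea, here is a minimal sketch of a normalizer that could sit in front of a function call. normalize_flag is a hypothetical name, not something arcpy provides, and the table is the same assumed pairing as before. It only works because the two sets of strings don't overlap.

```python
# Assumed pairing between attribute-style and function-style split policy flags.
SPLIT_POLICY = {
    "DefaultValue": "DEFAULT",
    "Duplicate": "DUPLICATE",
    "GeometryRatio": "GEOMETRY_RATIO",
}


def normalize_flag(value, table):
    """Hypothetical helper: return the function-style flag for either spelling."""
    if value in table:
        # Attribute-style spelling, e.g. "GeometryRatio"
        return table[value]
    if value in table.values():
        # Already the function-style spelling, e.g. "GEOMETRY_RATIO"
        return value
    raise ValueError(f"Unknown flag: {value!r}")


from_attribute = normalize_flag("GeometryRatio", SPLIT_POLICY)
from_function = normalize_flag("GEOMETRY_RATIO", SPLIT_POLICY)
```

Both calls resolve to the same flag, so a value read off a Domain object could be passed straight back in.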

3 Comments
JoshuaBixby

Unfortunately, Python doesn't support parameter name aliases like some other languages because that would allow for a fairly straightforward way to address it.

HaydenWelch

@JoshuaBixby I'm not worried about the parameter names, you can just pass *args into them as long as they're ordered properly. The bigger thing is the accepted literals for each parameter. There can be up to 10 flags that are different on both ends of the object lifecycle, meaning I need at least 10 conditions *per parameter* for mapping flag values. There's literally no need for this as far as I can tell, since both come from and go to the same C code on the backend, so why is there mangling of flags happening during that process?

 

Easiest solution here would be to standardize on the CIM/SDK flag values that are returned by the object getters. Then allow the old flags to be passed, but just don't hardcode them in a Literal so you get a linter warning using the old flags. That way nothing existing breaks, but the system is a lot more bi-directional and it nudges people who actually check type hints in the right direction.
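Roughly what I mean on the typing side, as a sketch only: the type hint lists just the CIM-style values, while the runtime still accepts the old flags. resolve_domain_type and the legacy table are hypothetical names, and the pairing between old and new strings is assumed from the names.

```python
from typing import Literal, get_args

# Only the CIM-style values appear in the type hint.
DomainType = Literal["CodedValue", "Range"]

# Old flags are still understood at runtime, but deliberately kept out of the Literal.
_LEGACY_DOMAIN_TYPE: dict[str, DomainType] = {
    "CODED": "CodedValue",
    "RANGE": "Range",
}


def resolve_domain_type(domain_type: DomainType) -> DomainType:
    """Hypothetical helper: accept either spelling, return the CIM-style value."""
    if domain_type in get_args(DomainType):
        return domain_type
    try:
        return _LEGACY_DOMAIN_TYPE[domain_type]
    except KeyError:
        raise ValueError(f"Unknown domain type: {domain_type!r}") from None


new_style = resolve_domain_type("CodedValue")
# Still works at runtime, but a type checker should flag the argument
# because "CODED" is not part of the Literal.
old_style = resolve_domain_type("CODED")
```

Existing scripts keep running, and anyone with a type checker turned on gets pointed at the new values.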

JoshuaBixby

For parameter names that are the same, e.g., mergePolicy and merge_policy, I understand how the difference comes about, since Python's best practice for naming variables uses snake case while camel case is the convention in some other languages like C, C++, etc. Where things get dicey for me is when the name itself is different, e.g., description and domain_description. Even if one converts snake case domain_description to camel case domainDescription, they aren't the same name although they refer to the same item. At least if the names were exactly the same and it was just a snake case versus camel case issue, it would be more straightforward to translate.

Unfortunately, the discrepancy with handling parameter names propagates down into the accepted literals, which exacerbates the situation.