I've been using the arcpy.management.ImportXMLWorkspaceDocument(...) function to try to formalize an old schema and ran into a really fun issue with how the import happens.
As it stands, the schema import is linear: each feature class is imported in the order it appears in the document, and any attribute rules associated with a feature class are imported at the same time.
The issue is that it's not uncommon for an attribute rule to access another feature class using Arcade's FeatureSetByName(...) function. When that happens, the Arcade parser validates the script by checking the now partially formed database for that feature class and any requested fields.
This validation fails if the relationships are circular, or if a rule accesses a feature class that has not yet been created, and the whole schema import hard-fails as a result.
The core Idea would be to either defer attribute rule creation until *after* the entire schema has been created, or offer the user the ability to skip importing rules entirely and make attribute rule application a separate step. You could also disable validation on the backend during a schema import and trust the user to validate rule functionality afterwards. (A rough sketch of the deferred approach follows the example below.)
Here's a simple example of a schema configuration that cannot be imported:
{
  "datasets": [
    {
      "name": "FC1",
      "attributeRules": [
        {
          "name": "Rule 1",
          "scriptExpression": "var fc2 = FeatureSetByName($datastore, 'FC2')"
        }
      ]
    },
    {
      "name": "FC2",
      "attributeRules": [
        {
          "name": "Rule 2",
          "scriptExpression": "var fc1 = FeatureSetByName($datastore, 'FC1')"
        }
      ]
    }
  ]
}
Since FC1 has a rule that relies on FC2 existing, the import immediately fails when FC1 is created, because Rule 1 fails validation.
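For illustration, here is a rough sketch of what the deferred approach could look like with the tools that exist today (the paths, field name, and triggering event are placeholders, not values from a real schema):

import arcpy

gdb = r'C:\data\Example.gdb'  # placeholder

# Phase 1: import a schema document that contains no attribute rules
arcpy.management.ImportXMLWorkspaceDocument(gdb, r'C:\data\schema_no_rules.xml', 'SCHEMA_ONLY')

# Phase 2: add rules only after every feature class exists, so the Arcade
# validator can resolve FeatureSetByName against a complete schema
arcpy.management.AddAttributeRule(
    in_table=f'{gdb}\\FC1',
    name='Rule 1',
    type='CALCULATION',
    script_expression="var fc2 = FeatureSetByName($datastore, 'FC2'); return Count(fc2)",
    field='SOME_FIELD',  # placeholder target field
    triggering_events='INSERT',
)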
Edit:
I have implemented a workaround in the helper library I use. It leans heavily on other classes and functions I've written, but this properly accounts for the broken parts of the ImportXMLWorkspaceDocument function:
# Standard-library and arcpy imports used below; SchemaWorkspace, SchemaDataset,
# and the rest of the Dataset class come from my helper library
from __future__ import annotations

import json
from pathlib import Path
from tempfile import TemporaryDirectory

from arcpy.management import (
    ConvertSchemaReport,
    CreateFileGDB,
    ImportXMLWorkspaceDocument,
)


class Dataset:
    ...

    @classmethod
    def from_schema(cls, schema: Path | str, out_loc: Path | str, gdb_name: str) -> Dataset:
        schema = Path(schema)
        out_loc = Path(out_loc)
        if (out_loc / gdb_name).exists():
            raise OSError(f'{gdb_name} already exists in target directory!')
        if schema.suffix not in ('.xml', '.json', '.xlsx'):
            raise ValueError(
                f'Invalid schema type {schema.suffix}, '
                'only xlsx, json, and xml documents can be imported'
            )
        # Convert the schema to json for easy parsing of attribute rules
        with TemporaryDirectory(f'{gdb_name}_json_schema') as tmp:
            tmp = Path(tmp)
            _gdb = str((out_loc / gdb_name).with_suffix('.gdb'))
            # Convert the schema to json
            _res, = ConvertSchemaReport(str(schema), str(tmp), 'json_schema', 'JSON')
            # Load in the schema
            workspace: SchemaWorkspace = json.loads(Path(_res).read_text(encoding='utf-8'))
            # Get all FCs on all levels
            features: list[SchemaDataset] = []
            for ds in workspace['datasets']:
                if 'datasets' in ds:
                    features.extend(ds['datasets'])
                else:
                    features.append(ds)
            # Use these to re-write any attribute rules
            guid_map = {fc['catalogID']: fc['name'] for fc in features if 'catalogID' in fc}
            # Extract rules and repair GUID interpolation from export
            for ds in workspace['datasets']:
                if 'datasets' in ds:
                    features = ds['datasets']
                else:
                    features = [ds]
                for fc in features:
                    if 'attributeRules' not in fc:
                        continue
                    rules = fc['attributeRules']
                    rule_dir = (tmp / 'rules' / fc['name'])
                    rule_dir.mkdir(exist_ok=True, parents=True)
                    for rule in rules:
                        # Repair the script (use the common name and not catalogID)
                        script = rule['scriptExpression']
                        for guid, name in guid_map.items():
                            script = script.replace(guid, name)
                        rule.pop('scriptExpression')
                        # Dump the rule (script and config side by side)
                        (rule_dir / rule['name']).with_suffix('.js').write_text(script, encoding='utf-8')
                        (rule_dir / rule['name']).with_suffix('.cfg').write_text(json.dumps(rule), encoding='utf-8')
                    # Delete all rules from the schema so the import won't fail
                    fc['attributeRules'] = []
            _new = (tmp / '_patch_schema').with_suffix('.json')
            _new.write_text(json.dumps(workspace))
            # Convert to importable XML
            _xml, = ConvertSchemaReport(str(_new), str(tmp), 'schema', 'XML')
            # Create a new GDB
            CreateFileGDB(
                out_folder_path=str(out_loc),
                out_name=gdb_name,
                out_version='CURRENT',
            )
            # Build base Schema (rule-free, so validation can't fail)
            ImportXMLWorkspaceDocument(str(_gdb), _xml, 'SCHEMA_ONLY')
            # Re-construct the rules now that every feature class exists
            ds = Dataset((out_loc / gdb_name).with_suffix('.gdb'))
            for feature_class in ds.feature_classes.values():
                if not (tmp / 'rules' / feature_class.name).exists():
                    continue
                try:
                    feature_class.attribute_rules.import_rules(tmp / 'rules' / feature_class.name)
                except Exception as e:
                    print(f'Failed to import rules for {feature_class.name}: {e}')
            return ds
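For context, a call to this looks something like the following (the schema file, output folder, and geodatabase name are placeholders):

ds = Dataset.from_schema(
    schema='exported_schema_report.xlsx',  # output of Generate Schema Report
    out_loc='C:/data',
    gdb_name='Formalized',
)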
As the code shows, I need to manually update every rule to replace each feature's GUID with its class name (a standalone illustration of that repair follows below). Rule import can also error out because the AddAttributeRule function is so incredibly strict. It still fails for any rule that expects a table to already have values, which you should probably guard with a condition anyway, but it's annoying that I can't add a rule that I know will work when the database is in use, just not while it's being generated.
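To make that repair concrete, here is a standalone sketch of what the rewrite loop does. The GUID value and the map entry are made up for illustration; the real guid_map is built from each feature class's catalogID as shown above:

# Hypothetical values, for illustration only
guid_map = {'{7D3C0E2A-1B4F-4E8C-9A5D-0F6E2C8B1A3D}': 'FC2'}

# An exported rule referencing another class by GUID rather than name
script = "var fc2 = FeatureSetByName($datastore, '{7D3C0E2A-1B4F-4E8C-9A5D-0F6E2C8B1A3D}')"

# Swap every known GUID for its plain-text class name
for guid, name in guid_map.items():
    script = script.replace(guid, name)

print(script)  # var fc2 = FeatureSetByName($datastore, 'FC2')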
Thanks for posting this. Can you share what version of ArcGIS Pro you are encountering this issue in? The simple reproduction case you gave above does not reproduce for me in ArcGIS Pro 3.5.
@HaydenWelch
Thanks for sharing your Idea.
I've spent some time talking with the Geodatabase team and gotten to the bottom of this issue. The root cause is in Generate Schema Report: it outputs attribute rule class references as GUIDs instead of plain-text names, which breaks the round trip of the report. These GUID references are not exposed through other export methods, such as Export XML Workspace Document, and so are prone to cause issues like the ones you have described.
We have opened an internal issue for this bug, to be addressed in near-term development. Luckily, I am the point of contact for schema report, so you got directly to the person you needed.
Thank you for taking the time to log this Idea. Since this is a bug, I will close the issue for kudos. I will make sure to share updates here when I have more specific information about what release a fix will be included in.
@SSWoodward Thanks for looking into this for me! Glad I was able to get to the right person. I'll probably have more of these as I continue work on my library, since I'm digging pretty deeply into a lot of different aspects of how data flow happens with arcpy and building workarounds for little nuisances like this.
Having them fixed would be great, but in the meantime I'll keep implementing my own solutions and opening issues here when I feel that they're bad enough to require work on your end.
You can follow my work over on GitHub, if you have any interest: https://github.com/hwelch-fle/arcpie
I gave it a gander. It's great stuff 🙂 Thanks again!