Hello,
I am trying to find a quicker way to identify Parcels that change from one monthly update to the next. For some backstory, I work at a local government that doesn't manage its Parcels in-house; we get a monthly update from the County.
I would like assistance in identifying a workflow that uses the APN (not the ObjectID) to compare our existing features with the newly downloaded dataset. The goal is to show three kinds of change: new APNs, missing APNs, and geometry changes in APNs that exist in both datasets.
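To make the request concrete, here is a minimal sketch of the comparison described above, in plain Python. It assumes each dataset has already been read into a dictionary mapping APN to a geometry string such as WKT; the function name `compare_parcels` and the sample APNs are made up for illustration. In ArcGIS Pro such dictionaries could likely be filled with `arcpy.da.SearchCursor` using an APN field and the `SHAPE@WKT` token, but that step is left out so the example runs anywhere.

```python
def compare_parcels(old, new):
    """Compare two {APN: geometry} mappings by APN, never by ObjectID."""
    old_apns, new_apns = set(old), set(new)
    return {
        # APNs present only in the new download
        "new": sorted(new_apns - old_apns),
        # APNs that were in the previous dataset but are gone now
        "missing": sorted(old_apns - new_apns),
        # APNs in both datasets whose geometry text differs
        "geometry_changed": sorted(
            apn for apn in old_apns & new_apns if old[apn] != new[apn]
        ),
    }


# Hypothetical sample data: APN -> WKT
previous = {
    "001-010-01": "POLYGON ((0 0, 0 10, 10 10, 10 0, 0 0))",
    "001-010-02": "POLYGON ((10 0, 10 10, 20 10, 20 0, 10 0))",
    "001-010-03": "POLYGON ((20 0, 20 10, 30 10, 30 0, 20 0))",
}
current = {
    "001-010-01": "POLYGON ((0 0, 0 10, 10 10, 10 0, 0 0))",
    "001-010-02": "POLYGON ((10 0, 10 12, 20 12, 20 0, 10 0))",
    "001-010-04": "POLYGON ((30 0, 30 10, 40 10, 40 0, 30 0))",
}

result = compare_parcels(previous, current)
for kind, apns in result.items():
    print(kind, apns)
```

Comparing geometry as exact text is strict: a different starting vertex or a change in coordinate precision would be reported as a geometry change even if the shape is the same, so some normalization or tolerance may be needed on real data.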
Have you taken a look at this tool in Pro? https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/feature-compare.htm
It may help you obtain the information that you are looking for.
@George_Thompson Yes, I have, and initially I was excited about it as an option. However, I am having trouble with it because it returns the comparison based on the ObjectID. Because the ObjectID is not maintained when I integrate the dataset with my own, I have not been able to use it.
Looking back at how the data is received, the ObjectID is also not kept the same for a given parcel across consecutive updates.
I've had good experiences with leveraging an SHA-1 hash of the non-key important data*, storing the hashes with the actual primary key in a parallel table. Due to the nature of the data, I've sometimes converted the data to only a few significant digits, then hashed that (to enable "fuzzy matching", when 3.71 -> 3.72 or 3.68 is not significant, but 8.23 is). Recently I've stored the geometry and non-geometry hashes separately, and have even been able to match rows without a useful rowid key by inverting the hash dictionary spanning tens of millions of features, so that hashes could resolve feature matching at 85-97% accuracy.
Unfortunately, I can't release my code without approval. And in order to make this work, you need to be really comfortable with integer endianness and floating-point storage representations. I can say the core code is a set of encoder functions (one for each supported type) whose output is stored as a bytearray sequence that is fed into a core Python hashing function. Use of SHA-1 can generate safety warnings, but using it for hashing (not cryptographic) purposes is perfectly safe.
- V
---
*Unimportant data would include a URL that incorporates the primary key and another column, which could be recomputed (or changed at will).
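Since the original code can't be shared, here is a rough sketch of how the hashing idea described above might look. It is not the poster's implementation: every function name, the little-endian layout, the length prefix on text, and the sample values are assumptions made for illustration. It shows per-type encoders feeding a bytearray into SHA-1, optional rounding to a few significant digits for fuzzy matching, separate attribute and geometry hashes, and an inverted hash dictionary for matching rows that have no usable key.

```python
import hashlib
import struct
from collections import defaultdict
from math import floor, log10


def round_sig(value, digits):
    """Round to a number of significant digits (3.71 and 3.68 both become 3.7)."""
    if value == 0:
        return 0.0
    return round(value, digits - 1 - floor(log10(abs(value))))


def encode_int(value):
    return struct.pack("<q", value)  # 8-byte signed integer, little-endian


def encode_float(value, digits=None):
    if digits is not None:
        value = round_sig(value, digits)
    return struct.pack("<d", value)  # IEEE 754 double, little-endian


def encode_text(value):
    data = value.encode("utf-8")
    # Length prefix so ("ab", "c") and ("a", "bc") encode differently
    return struct.pack("<I", len(data)) + data


ENCODERS = {int: encode_int, str: encode_text}  # other types are not handled


def hash_values(values, digits=None):
    """Encode each value by type, append to one bytearray, and hash it."""
    buf = bytearray()
    for value in values:
        if isinstance(value, float):
            buf += encode_float(value, digits)
        else:
            buf += ENCODERS[type(value)](value)
    return hashlib.sha1(buf).hexdigest()


def hash_row(attributes, coords, digits=2):
    """Return (attribute_hash, geometry_hash); only attributes are rounded."""
    flat = [c for xy in coords for c in xy]
    return hash_values(attributes, digits), hash_values(flat)


def invert(hashes_by_key):
    """Map hash -> list of keys, to match rows that lack a usable key."""
    inverted = defaultdict(list)
    for key, digest in hashes_by_key.items():
        inverted[digest].append(key)
    return inverted


# Hypothetical data: a zoning code and an acreage, plus a square parcel
square = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0), (0.0, 0.0)]
old_attr, old_geom = hash_row(["R1", 3.71], square)
new_attr, new_geom = hash_row(["R1", 3.68], square)
print(old_attr == new_attr)  # small acreage drift is ignored

geom_lookup = invert({"001-010-01": old_geom})
print(geom_lookup[new_geom])  # candidate APNs for a row with no key
```

On locked-down systems where SHA-1 is flagged, Python 3.9 and later accept `hashlib.sha1(buf, usedforsecurity=False)` to mark the hash as non-cryptographic.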
Are you looking to update a local layer of parcels with the changes, or to quickly produce a table of APNs that have changed between the previous dataset and the new dataset, which your stakeholders can reference?