Problem: Datasets have different domains that mean the same thing
Example: Street Suffix' can be provided in different formats when they need to be joined you have to substitute the data manually
Ie: Crt, Ct, Court
Solution: Have a thesaurus/catalog type option where if the data contains part of these suffix to be joined in the record it recognises its similar and joins
A quick suggestion on the title of this one - it would better if it were "Join similar" rather than "Append similar"
I got some input from a colleague on this one, and something that could work in this situation would be to generate a translation table:
Oid, desc, code
1, court, 1
2, ct, 1
3, crt, 1
4, street, 2
5, st, 2
6, ave, 3
Etc
And now 3 tiers of joins.
Join Origin table (street suffix) to translation table ‘desc’ == join1
Join Destination table (street suffix) to translation table ‘desc’ == join2
Then join1.code == join2.code == Final result (hide the fields you don’t want to see).
Also if they are really different domains, then the key is the code stored in the data, not the description. If the codes are different, then you need the pattern described, if the codes are the same but the descriptions are different, then no problem the join will work at the db level.
Hope this helps!
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.