Database Auto-Population Using Python

JD1016 · ‎10-05-2023

Hello,
I've attached two example tables to this posting that I hope will illustrate my question.
The first, TXDATABASE..xlsx, is what you can think of as a "destination" table whereby the records under the fields SVOLTAGE and PVOLTAGE will be populated based on matching information contained within the second, TXSOURCEDB.xlsx, which you can think of as the "source" table.
I am not a writer of python, or any other programming language for that matter, but I'm hoping someone out there can help me solve this using code.
Is there coding that would take the values found under "NUMBER", "KVA", and "TYPE" from my source table, match them against what they see in the destinations table, and auto-fill SVOLTAGE and PVOLTAGE into the destination table with what is contained in the source table? It's important that all three (NUMBER, KVA, and TYPE) match because there are duplicate NUMBER'S in the field and it will require the other two to match so there is no doubt of what should be populated in SVOLTAGE and PVOLTAGE.
I hope that makes some degree of sense.
Thank you.
Jeff

MobiusSnake · ‎10-05-2023

I do this kind of thing pretty regularly, here's some code that you could run from the Python window in Pro to do what you need. I've put in a bunch of comments to explain what each part does, you'll need to adjust paths to the data, possibly field names, but hopefully it gives you a good start:

# Create a list of fields you want to use.  This is an easy
# case because both tables use the same fields.
field_list = ["NUMBER", "KVA", "TYPE", "SVOLTAGE", "PVOLTAGE"]

# This is a key:value lookup that will contains "keys" made up
# of the first three fields, pointing to values containing pairs
# of the last two fields.
voltage_lookup = dict()

# This will read the contents of the source table into the lookup.
with arcpy.da.SearchCursor(r"C:\Data\data.gdb\voltage_src", field_list) as sc:
    for row in sc:
        lookup_key = row[:3]
        lookup_val = row[3:]
        if lookup_key in voltage_lookup:
            # This presumably shouldn't happen but is a good data
            # quality check.
            print(f"Duplicate key found: {voltage_key}")
        else:
            voltage_lookup[lookup_key] = lookup_val

# This will read each record in the destination table, read the three-value
# key, consult the lookup, and update the last two fields appropriately.
with arcpy.da.UpdateCursor(r"C:\Data\data.gdb\voltage_dest", field_list) as uc:
    for row in uc:
        lookup_key = tuple(row[:3])

        # This is a quick check to see if we have a record in our destination
        # table that isn't in the source table.
        if lookup_key not in voltage_lookup:
            continue
        
        # Do the lookup and post the update.
        lookup_val = voltage_lookup[lookup_key]
        update = lookup_key + lookup_val
        uc.updateRow(update)

View solution in original post

MobiusSnake · ‎10-05-2023

I do this kind of thing pretty regularly, here's some code that you could run from the Python window in Pro to do what you need. I've put in a bunch of comments to explain what each part does, you'll need to adjust paths to the data, possibly field names, but hopefully it gives you a good start:

# Create a list of fields you want to use.  This is an easy
# case because both tables use the same fields.
field_list = ["NUMBER", "KVA", "TYPE", "SVOLTAGE", "PVOLTAGE"]

# This is a key:value lookup that will contains "keys" made up
# of the first three fields, pointing to values containing pairs
# of the last two fields.
voltage_lookup = dict()

# This will read the contents of the source table into the lookup.
with arcpy.da.SearchCursor(r"C:\Data\data.gdb\voltage_src", field_list) as sc:
    for row in sc:
        lookup_key = row[:3]
        lookup_val = row[3:]
        if lookup_key in voltage_lookup:
            # This presumably shouldn't happen but is a good data
            # quality check.
            print(f"Duplicate key found: {voltage_key}")
        else:
            voltage_lookup[lookup_key] = lookup_val

# This will read each record in the destination table, read the three-value
# key, consult the lookup, and update the last two fields appropriately.
with arcpy.da.UpdateCursor(r"C:\Data\data.gdb\voltage_dest", field_list) as uc:
    for row in uc:
        lookup_key = tuple(row[:3])

        # This is a quick check to see if we have a record in our destination
        # table that isn't in the source table.
        if lookup_key not in voltage_lookup:
            continue
        
        # Do the lookup and post the update.
        lookup_val = voltage_lookup[lookup_key]
        update = lookup_key + lookup_val
        uc.updateRow(update)

JD1016 · ‎10-05-2023

Outstanding! Thank you so much for taking the time in responding to my post. It provides an exceptional start for me.

Jeff

RobertKrisher · ‎10-05-2023

There's a few optimizations that could be made to the script.

First, instead of doing a "not in" check then getting the value, just use the .get method on the dictionary and return a default value of none and skip the row if you get the default value. This allows you to check for existence and get the value in a single call.

Another optimization would be to check whether the existing value on the row already matches the lookup value. If the row already has the correct value you can then continue onto the next row without calling update on the cursor. In situations where you're going to be running this script repeatedly against the same table and only populating a handful of new rows each day this is a huge boost to performance (since you're not having to incur the cost to update/store each row when no value has changed).

JD1016 · ‎10-05-2023

Just wanted to drop a follow-up line. This code works beautifully and the only modification that I made was in the file path for this initial test. The fields will probably be expanded, but, as you indicated, that can be easily done as well.

Thank you again and I hope someone else may benefit from your valuable assistance.

Jeff