Calculate Field Tool Does not Complete - Help

04-10-2013 07:42 AM
ModernElectric
Frequent Contributor
I need help badly. I have been looking through all of the forum posts and Help sections to try to figure out what is wrong, with no luck. Here is what I am doing:
    I have a File Geodatabase (ArcMap 10.0) with a feature class (Service Location), which is the physical location of our electric meters. I have multiple fields that I need to update on a nightly basis. This data comes from our CIS system, which holds account, customer, and meter attribute information. I have a script that runs automatically every night, pulls an Oracle view through an ODBC connection, and brings it into my File Geodatabase as a geodatabase table. What I have to do is update a specific field (Account Number) in the feature class with the same field in the table (ACCOUNTNUM), just as an example. I want this script to also run automatically at night. I have around 9,500 records in the feature class, with about 15 different fields to do this to. The problem I am having is that it takes 20 to 25 minutes for each field to update, so at times the script can take 8 to 9 hours to complete. From what I understand, this should take only a few minutes, so what I am doing is not working.
I decided to go back to the drawing board to see if there is something I am missing in the code. I am working with one field right now to see if I can make it work and get the update time down before I rewrite the rest of the code.

This is what I currently have and it does not seem to be working:

## Set the necessary product code
import logging

# Import arcpy module
import arcpy
from arcpy import env

# if run in ArcGIS Desktop, show messages, also print to log
def log(method, msg):
    print msg
    method(msg)

logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s %(levelname)-s %(message)s',
        datefmt='%a, %d %b %Y %H:%M:%S',
        filename='Test_Update.log',
        filemode='w'
        )

log(logging.info, "Updating Service Locations Feature Class Begins")

# Set environment settings
env.workspace = "C:\\MEWCo GIS System\\Electric System\\MEWCo_Electric_Model-LOCAL.gdb"

# Set local variables:
inFeatures = "SERVICE_LOCATION"       # Service Location feature class
layerName = "SERVICE_LOCATION_LAYER"  # Service Location feature layer
fieldName1 = "SERVICE_ADDRESS"
expression1 = "CIS_Account_Data.SERVICE_ADDRESS"
joinField = "POLE_NUMBER"
joinField2 = "MAPNO"
joinTable = "CIS_Account_Data"

# Create a feature layer from the Service_Location feature class
log(logging.info, "Create Feature Layer for Service Location Feature Class")
arcpy.MakeFeatureLayer_management(inFeatures, layerName)

# Join the feature layer to the table
log(logging.info, "Join Table to Feature Layer Service Location")
arcpy.AddJoin_management(layerName, joinField, joinTable, joinField2)

# Process: Calculate Field (Member Number)
log(logging.info, "Update Member Number Field")
arcpy.CalculateField_management(layerName, "SERVICE_LOCATION.MEMBER_NUMBER", '!MEMBERNO!', "PYTHON")

log(logging.info, "Complete")


I am not sure what is slowing this down. Am I looking at this process the wrong way? Are there different tools that can be used to complete this mission?

Thank you for your help
33 Replies
by Anonymous User
Not applicable
Caleb:

Is there any reason why Chris' original script should take so long to run?

I am working with a much larger feature class and table performing the same type of join and calculate geoprocess and it completes in 5 minutes.

Is there any chance it could be due to field type differences between the join field in the table and the field that is being calculated in the feature class?


Hi Michael,

I am not sure what the issues were. I guess I should have clarified that I was having slow performance on very large data sets, and I was running several different field calculations. I just know that the processing time went from about 20 minutes with the join-and-calculate method down to under 2 minutes with the AttributeUpdate() function I posted above.

I can do a speed test real quick and post the times for each process.
MichaelVolz
Esteemed Contributor
Caleb:

Your solution is very elegant compared to the script that Chris wrote and I helped him modify. I cannot speak for Chris, but your script is advanced and might be difficult for a person new to Python scripting.

It just seems that Chris' script has a minor issue that can be quickly corrected (once it's found), as he is working with such a small dataset.
ModernElectric
Frequent Contributor
Caleb/Michael-

All is correct. What I am doing is simple ArcGIS Python scripting 101. When it comes to talk of dictionaries, indexes, etc. (advanced Python scripting), I am lost.

There has to be an easier way for a rookie Python scripter like myself to program a process that takes the "Member Number" field from a file geodatabase table and copies it to a "member number" field in a file geodatabase feature class after it makes a join.

Would it be easier to:
   Add Join (table to feature class), using Map Number/MAPNO as the join attribute
   Export to a new feature class
   Calculate the field(s)
   Delete the fields that came from the table
   Rename the feature class back to the original name

Is that overkill, or would it work?


MichaelVolz
Esteemed Contributor
Chris:

With your current script without the data dictionary, can you verify that the data type is the same in the original field of the feature class and the source field you are calculating from in the table?

Like I said before, I have a script that works like yours on a much larger dataset and it finishes in less than 5 minutes.
ModernElectric
Frequent Contributor
Caleb-

   I got your process/script to work! It took about 15 seconds for the MEMBER_NUMBER field to update from the CIS_Account_Data table.

Now I know I have a lot of work to do, since I have about 20 fields in my electric model and about 15 in my water model to have this done to.

What would be my next step so this works for all of my fields?

Example:

Customer name:
    Feature Class - CUSTOMER_NAME
    Table - CUSTNAME

Thank you very much!!!!!
ModernElectric
Frequent Contributor
Caleb and Michael-

   I got it all to work. The script(s) with the dictionaries did EXACTLY what I wanted them to do, in almost no time at all. Now it's just going to take me a while to program the fields into the script and run an overnight test.

Next step: reading up on dictionaries and indexes so I can attempt to understand this large script that Caleb sent me.

Thank you all (Caleb & Michael) very much for all your help and input. I think my blood pressure will come down now too...
MichaelVolz
Esteemed Contributor
Most of the thanks goes to Caleb.

Although my script runs in a reasonable timeframe, I can use Caleb's code to make my script even better.  Thank you very much Caleb.
by Anonymous User
Not applicable
@ Chris:

I am glad you were able to get it to work. As for having it do more fields, you just need to make a list of the fields to be updated in table A and a list of the fields from table B that contain the values. However, you must make sure to keep the fields in the correct order for each list. For example:

list_A = ['field_1', 'field_2', 'field_3', ...]  # for table A
list_B = ['field_A', 'field_B', 'field_C', ...]  # for table B


I have never tested this on more than 7 or so fields at a time but I am sure that it will work the same for 20.
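To make the positional pairing concrete, here is a plain-Python sketch (the field names are made up for illustration) showing how the two lists line up; if the order is off by one, the wrong field gets calculated silently:

```python
# Hypothetical field names: feature class fields and their CIS table sources.
list_A = ['CUSTOMER_NAME', 'SERVICE_ADDRESS', 'MEMBER_NUMBER']  # for table A
list_B = ['CUSTNAME', 'SERVICE_ADDRESS', 'MEMBERNO']            # for table B

# zip() pairs the lists by position; building a dict makes each
# update-field -> source-field mapping easy to inspect.
field_pairs = dict(zip(list_A, list_B))

print(field_pairs['MEMBER_NUMBER'])  # -> MEMBERNO
```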

Now I will try to explain the dictionary usage:

Most of my script is error handling, but the main meat of the script that is actually updating the table is this part:

# Create dictionary
path_dict = {}
srows = arcpy.SearchCursor(join_table)
for srow in srows:
    keyrow = srow.getValue(join_key)
    valrow = srow.getValue(k)
    path_dict[keyrow] = valrow

# Update cursor
urows = arcpy.UpdateCursor(source_table)
for row in urows:
    upkey = row.getValue(in_field)
    try:
        if upkey in path_dict:
            row.setValue(v, path_dict[upkey])
            urows.updateRow(row)
    except:
        pass  # skip nulls


So a dictionary is just made up of pairs consisting of keys and values, like this: {key: value}. You can look up a value by supplying the key, and a value can be anything (even objects and entire lists). The above portion uses a search cursor on table B (the one with the values) to make a dictionary, using each record in the join field (your common field) as a key. The corresponding value will be one of the fields from the "join_fields" parameter. It creates a separate dictionary for each one.

Next, an update cursor is used to search through the table and if the value in the join_field in table A is a key in the dictionary, it will update the corresponding "update_field" for that row.  I have an exception set up so that it will skip nulls when working through shapefiles.  I hope this makes sense.
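The two cursor passes above can be sketched in plain Python (no arcpy, made-up rows and field names) to show the core idea: one pass builds the lookup dictionary, the second pass updates only the rows whose key is found:

```python
# Table B: (join key, value) pairs, standing in for the SearchCursor rows.
table_b = [('MAP001', 'M-100'), ('MAP002', 'M-200')]

# Table A: rows to update, standing in for the UpdateCursor rows.
table_a = [{'POLE_NUMBER': 'MAP001', 'MEMBER_NUMBER': None},
           {'POLE_NUMBER': 'MAP003', 'MEMBER_NUMBER': None}]

# "SearchCursor" pass: build the {join key: value} dictionary.
path_dict = {}
for key, val in table_b:
    path_dict[key] = val

# "UpdateCursor" pass: only rows whose key exists in the dictionary change.
for row in table_a:
    upkey = row['POLE_NUMBER']
    if upkey in path_dict:
        row['MEMBER_NUMBER'] = path_dict[upkey]

print(table_a[0]['MEMBER_NUMBER'])  # -> M-100 (matched)
print(table_a[1]['MEMBER_NUMBER'])  # -> None (MAP003 has no match, left alone)
```

Each row costs one dictionary lookup instead of a join, which is why this approach scales so much better than calculating across a joined layer.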


@ Michael

I had not tested the field calculator in arcpy land since I upgraded to 10.1.  However, after a quick test it seems that it runs much more efficiently now than it did in the past.  I have since gotten a new computer so maybe that is why it was so slow in the past.  I am glad you find this useful.  I have another function from the same script that copies fields from one table to another with the same schema so you can avoid the "field mapping" type operation like in the AttributeUpdate().  I can post that if anyone is interested.
ModernElectric
Frequent Contributor
Caleb-

  Yes, some of that makes sense (the use of dictionaries). I plan on doing some more research to see if I can use this with other things/ideas in my other geodatabases. I work with electric and water utility infrastructures and have been trying to develop some scripts/processes so I can integrate other parts of our company with GIS (CIS data, SCADA, TWACs, etc.).

Again, thank you very much. This is going to cut down a lot of time in updating my feature classes with tables from our CIS Oracle database.

Chris

MichaelVolz
Esteemed Contributor
Chris and Caleb:

After further research, I think I have found out why my script, which performs the same action on a much larger dataset, runs quickly: I create an index on the field in the table that I am using to join to the feature class before I create the join. I took the model from another person who built it in ModelBuilder, and I exported it to a Python script where I made some tweaks (that is why I was unaware of the index previously).

As such, Chris, you could also add an arcpy.AddIndex_management step on the CIS table, which should dramatically speed up your original script (only if you cannot get Caleb's more advanced script to work for you).
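The effect of an attribute index can be sketched in plain Python (made-up data, no arcpy): without an index, matching each feature against the CIS table is a linear scan per row; an index turns each match into a direct lookup, much like a dictionary:

```python
# Hypothetical CIS table: (MAPNO, account number) pairs, ~9,500 rows.
cis_rows = [('MAP%04d' % i, 'ACCT%04d' % i) for i in range(9500)]

# No index: scan the whole table for every feature (O(n) per lookup).
def lookup_scan(mapno):
    for key, acct in cis_rows:
        if key == mapno:
            return acct
    return None

# "Indexed": build the lookup structure once, then each match is O(1).
index = dict(cis_rows)

# Both strategies return the same answer; only the cost per row differs.
assert lookup_scan('MAP1234') == index['MAP1234'] == 'ACCT1234'
```

This is also why the dictionary-based AttributeUpdate() approach discussed earlier in the thread is fast: it is effectively building its own in-memory index.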