Modifying Permanent Sort script by Chris Snyder

RichardFairhurst · ‎08-02-2010

Chris:

I am not sure how best to contact you directly, but I have been trying to modify a script you wrote and posted on the public scripts page about a year ago. Your script permanently sorts a feature class or table to a new output feature class or table.

I have been trying to modify the script to optionally add a valid additional field of type Long to the output that will store a copy of the original data source's OID field values. Your script by default drops the original OID values from the output. However, I need these OID values where I want to join the output back to the source, but the data source has no other unique ID than the OID field and I cannot alter the source data schema. I have some code that basically works if I can assume that all OID fields are actually named "OBJECTID", but I do not think that is the case.

In VBA there are methods that can query a data source to confirm it has an OID and then get the OID field's name (which tells me there is more than one way these fields can be named). Is there anything equivalent in Python? Or how would you approach checking these two aspects of a data source to add appropriate error checking routines to your Python script to handle a subroutine like the one I want? If the new subroutine works, I would always be able to optionally include a join field in my sorted output that would work with any source that has an OID value without ever having to modify the source schema.

Thanks for the script and I hope you can help me.

Rich

RichardFairhurst · ‎08-03-2010

I may have answered my initial question. I found that an output variable holding the describe information of my input feature class/table has a method called OIDFieldName that can extract the source OID field name. That should be all I need in order to verify that an input has an OID field and to make sure that I transfer information from it using its proper OID field name.

However, I have a follow up question about ModelBuilder tools in general. I noticed that when I use the tool Chris created for this script that whenever I choose an input feature class or table, an output feature class/table name is automatically created for the output parameter that appends "_sortrecords" to the original input name. It also makes sure that the output name is unique within the output workspace by appending a number to the name if necessary.

How is the output name parameter automatically being populated by the tool? I tried checking the script and every property tab for the tool, but I cannot see what is triggering this behavior. I also don't see any evidence of a IGPFunction asociated with the tool. I would like to know how to replicate this kind of behavior for other tools I might build or modify.

Anyone know how its done? Thanks.

Rich

ChrisSnyder · ‎08-03-2010

Preserving the original OID is a good idea! And the .oidfieldname property is the right one to use...

The output names are given the default toll name suffix by the Toolbox GUI seemingly automatically now in v9.3+ (the default naming is not handled in a controlled fashion by the script of tool parameter settings). It wasn't this way pre-v9.2, and depending on how you look at it, it can be a good or bad thing. Personally I don't like it, since it seems to override the "default" parameter values that you assign in the tool setup dialog.

RichardFairhurst · ‎08-03-2010

Chris:

Thanks for the answers. I can see your point about the auto fill-in behavior. It could be annoying since I do not really know the conditions that trigger it and can't control when I want it or don't.

I also should mention that I modified your script to support up to four (4) sort fields on the input source so that the sorting capabilities are equivalent to an Advanced Table Sort on a Tableview in 9.3. That makes the tool more consistent and complimentary with the Desktop interface.

I do not want to publiish my version on the Scripts page, since my version really is mostly your code. I would like to encourage you to publish a revision of your script to do these things for everyone interested. If you think it would save you any time, I can forward you my version, but I am sure you can incorporate these revisions as well or better than I have. However, either way, it would be nice if you could review my script modifications, since you understand what the overall script is doing and could provide me with some valuable feedback on whether I am using Python to its full potential. I am just beginning to transition from VBA to Python and I really don't know the Python language or coding conventions that well.

Rich

ChrisSnyder · ‎08-03-2010

Rich,

I will post an update to acrscripts in the next few days. So as I understand, the requirements are:

1. The user can select up to four fields to sort.
2. The user can specify a new field name that will contain the original OID prior to sorting.

Sound Good?

RichardFairhurst · ‎08-03-2010

Rich,

I will post an update to acrscripts in the next few days. So as I understand, the requirements are:

1. The user can select up to four fields to sort.
2. The user can specify a new field name that will contain the original OID prior to sorting.

Sound Good?

Exactly. Sounds great.

I will be interested in comparing your final code to mine. I expect I will learn a few more things I didn't know about Python. Thanks for taking this on. Your original tool was cool, but this new version will be way cool! :cool:

Rich

RichardFairhurst · ‎08-03-2010

I guess the old ArcScripts site is now uneditable - Damn! Until I get a CodeGallery thing going, here's the new script/toolbox. Didn't do much extensivee bug testing - Let me know if you find any issues...

I guess our posts just missed each other. I tried to open your zip file to WinZip using IE8 and get an error message. I tried Mozilla Firefox instead and that worked fine, for anyone else trying to download your file. (I guess Microsoft is trying to protect me from myself again. 😛 )

RichardFairhurst · ‎08-03-2010

Didn't do much extensivee bug testing - Let me know if you find any issues...

Chris:

The new script did what I requested, but I noticed you did not rewrite the portion of your error checking tests to ensure that duplicate field names were not entered into multiple Sort Field choices so that the test accounted for the new sort field parameter.

I rewrote the section beginning at line 133 as follows to both check for all possible field duplications and to brake it down into separate field tests to keep it more readable:

    #make sure the same field isn't trying to be sorted twice
    if (sortFieldName4 not in ["","#"," "] and (sortFieldName4 in [sortFieldName1,sortFieldName2,sortFieldName3])):
        message = "ERROR: Duplicate field name in Sort Field #4.  Can't sort the same field twice! Exiting script..."; showPyError(); sys.exit()
    if (sortFieldName3 not in ["","#"," "] and (sortFieldName3 in [sortFieldName1,sortFieldName2])):
        message = "ERROR: Duplicate field name in Sort Field #3.  Can't sort the same field twice! Exiting script..."; showPyError(); sys.exit()
    if (sortFieldName1 == sortFieldName2):
        message = "ERROR: Duplicate field name in Sort Field #2.  Can't sort the same field twice! Exiting script..."; showPyError(); sys.exit()

I noticed that when I just wanted to sort just two fields that I as long as I put the first sort field in the Sort Field #1 parameter I could add the second sort field to any of the other 3 sort field parameters (as long as I specified the sort order that paired with the sort field parameter I chose) and the script worked the same. That is nice. Also, thanks for the quick modification.

I am still trying to understand the way the nested try and if block works in the section where you transferred the OID values. Your code worked, but based on the comment above that section I would have thought that it would have thrown an error when the OIDFieldName tried to calculate to the output's true OIDFieldName and that the user provide OID field would not have been populated. Obviously I made a wrong assumption somewhere, but where did I go wrong? When both tables have a matching OID field name and the setvalue matches the names and attempts to overwrite the inserted record's true OID value, does that not cause an error? Did I miss that somehow the code does not actually try to overwrite a true OID value in the output? Please provide me with a little insight. (Faulty assumptions like that can sometimes stick in my thinking and cause me to write unneeded code to work around a non-problem 😮 .)

Thanks again.

Rich

ChrisSnyder · ‎08-04-2010

Great - thanks for the error checking part - I will add that to my version.

The part where the OIDs get populated: The deal is that you can't populate some fields in a fc/table like OID, Length, Area, etc. That is what the "try" is for: Loop through the fieldList, and for each field, try to populate that field (but we can't for OID, Shape, etc). However, we can popualte the ORIG_OID field (whataver its called) and other fields since they are writable. I have some other logic in there for dealing with the 10 character limitation for .dbf/.shp files, and there probably is a better way to organize that, but... Also some logic for dealing with the ORIG_OID field if the user provided a field name, otherwise don't populate it (since we never added it in the 1st place).

Hope that helps...

RichardFairhurst · ‎08-04-2010

I realized why the loop made no sense to me. It is written so that every time the loop writes to a field that can have its value set, it also writes to the ouput OID field provided by the user (if the input has an OID field). So if there are 10 writable fields, the user OID field is writen 10 times per record (more writeable fields only makes processing the loop worse due to twice as many write attempts). My VBA training from ESRI on loops has taught me to place as much of my testable conditions outside of the loop as possible and then minimize writing to a cursor within the loop to maximize speed and efficiency.

To achieve those objectives I set some variables outside the loop that have pre-tested whether a write of the user provided OID field should even be attemped within the loop. I then only attempt to write to the user field when the input OID field is being read from the input (if the input has no OID field the user provided field never gets a write attempt and a related warning is reported at the end). All other non-OID fields being read from the input will bypass the user provided field and use your normal field writing code. This revison of the loop results in only a single write attempt for each field in a record. This should definitely improve overall speed of the cursor writing process.

My revision of the loop is as follows:

    inputLayerOidFieldName = dsc.oidfieldname
    #Only attempt to write to the users OID field if the a field name is provided and the input has an OID field.
    if originalOidFieldName not in ["","#"," "] and inputLayerOidFieldName > " ":
        tryOidFieldWrite = True
        origOidPopulateSuccessFlag = True
    else:
        tryOidFieldWrite = False
        origOidPopulateSuccessFlag = False
    while searchRow:
        insertRow = insertRows.newrow()
        for fieldName in inputLayerFieldList:
            try: #you can't write the OID, length, area, etc. fields!
                if outShpDbfFlag == False:
                    # Only attempt to write to the user OID field when the input OID field is being read
                    if tryOidFieldWrite == True and inputLayerOidFieldName == fieldName:
                        try:
                            insertRow.setvalue(originalOidFieldName, searchRow.getvalue(inputLayerOidFieldName))
                        except:
                            origOidPopulateSuccessFlag = False
                            pass
                    # All fields other than the input OID field are passed to this code.  The input OID will also be passed if the user did not provide an output OID field name, but being non-writable it will fail.
                    else:
                        insertRow.setvalue(fieldName, searchRow.getvalue(fieldName))
                else:
                    # Only attempt to write to the user OID field when the input OID field is being read
                    if tryOidFieldWrite == True and inputLayerOidFieldName == fieldName:
                        try:
                            insertRow.setvalue(originalOidFieldName[0:10], searchRow.getvalue(inputLayerOidFieldName))
                        except:
                            origOidPopulateSuccessFlag = False
                            pass
                    # All fields other than the input OID field are passed to this code.  The input OID will also be passed if the user did not provide an output OID field name, but being non-writable it will fail.
                    else:
                        insertRow.setvalue(fieldName[0:10], searchRow.getvalue(fieldName))
            except:
                pass
        recordCount = recordCount + 1
        insertRows.insertrow(insertRow)        
        searchRow = searchRows.next()

    message = "Successfully wrote " + str(recordCount) + " records!"; showPyMessage()
    # Only publish a warning if the OID field write attempt should have been made and failed or if the user provided a field but the input had no OID field.
    if originalOidFieldName not in ["","#"," "] and tryOidFieldWrite == True and origOidPopulateSuccessFlag == False:
        message = "WARNING: For some reason, the original OIDs could not be preserved in the output!"; showPyWarning()
    if originalOidFieldName not in ["","#"," "] and tryOidFieldWrite == False:
        message = "WARNING: The input had no OID field and could not be preserved in the output!"; showPyWarning()

I hope that these revisions meet with your approval and can be incorporated into your final code.

Rich