Performance for Append vs Insert Cursor

8174
21
Jump to solution
08-09-2013 11:09 AM
AlanToms
Occasional Contributor
Ok I have a feature class (260k records) and 3 tables.  I need to get specific information (about 20 fields) from each into a new feature class.  I have several different ways to do this and I???m curious if anyone has done any performance testing and  knows which of my options would be ???best???.

Option 1
I could create several joins and append the data into the new feature class, with some additional field calculations for cleanup.

Option 2
Create a python script that would loop through each record, search cursor the other tables for data, then insert cursor into the new feature class.


Thank you
Alan
0 Kudos
21 Replies
by Anonymous User
Not applicable
No problem, Richard. I still owe you for the VBA help you gave me a while back! This worked for me...Chris probably has a better way to sort by date but here's what I came up with:

# fix date, returns a dictionary with sortable values
def FixDate(in_date):
    month_dict = {'Jan':'01','Feb':'02','Mar':'03','Apr':'04',
                  'May':'05','Jun':'06','Jul':'07','Aug':'08',
                  'Sep':'09','Oct':'10','Nov':'11','Dec':'12'}
    
    # split date string by space and remove comma
    in_date_items = in_date.replace(',','').split()

    # separate year,month,day
    year = in_date_items[-1]
    month = month_dict[in_date_items[0]]
    day = in_date_items[1].zfill(2)

    # sort date dictionary
    sort_dict = {}
    sort_dict[in_date] = '/'.join([year,month,day])          
    return sort_dict



# your dictionary of applicants
apps = {'Smith, John':[['Smith, John', 'Jan 10, 2013', 'Applied'],
        ['Smith, John', 'Feb 3, 2013' 'Assigned'],
        ['Smith, John', 'Mar 25, 2013', 'Tested'],
        ['Smith, John', 'Jun 11, 2013', 'Hired']],
        'Smith, Jack':[['Smith, Jack', 'Jan 10, 2013', 'Applied'],
        ['Smith, Jack', 'Feb 7, 2013', 'Assigned'],
        ['Smith, Jack', 'Mar 25, 2013', 'Tested'],
        ['Smith, Jack', 'Jun 5, 2013', 'Rejected']],
        'Smith, Kim':[['Smith, Kim', 'Jan 10, 2013', 'Applied'],
        ['Smith, Kim', 'Feb 12, 2013', 'Rejected']]}

# return most recent action for each person
action_dict = {}
for name, info in apps.iteritems():
    #iterates through each list of actions to find most recent action
    for details in info:
        dates = {}
        dates[details[1]] = details[-1]
   
    recent = sorted(list(FixDate(dt) for dt in dates.keys()))[-1] # grabs the last item (max in this case)
    action_dict[name] = [recent.keys()[0], dates[recent.keys()[0]]] # copies this into a new dictionary
    print name, recent.keys()[0], dates[recent.keys()[0]]



        

This printed:

Smith, Kim Feb 12, 2013 Rejected
Smith, Jack Jun 5, 2013 Rejected
Smith, John Jun 11, 2013 Hired


This also stores these values in another dictionary called "action_dict" so if you want to look up someone later you can do:
print action_date['Smith, John']

To get:
['Jun 11, 2013', 'Hired']

or you can print them all by:
for name, action in action_dict.iteritems():
    print name, action


this prints:
Smith, Kim ['Feb 12, 2013', 'Rejected']
Smith, Jack ['Jun 5, 2013', 'Rejected']
Smith, John ['Jun 11, 2013', 'Hired']


EDIT: @ RICHARD: I found a mistake in my code, I changed the part in red
0 Kudos
RichardFairhurst
MVP Honored Contributor
Just for the record, the Summary Statistics tool can do a Min and Max summary of a date field, but the process and result is far from ideal.  In the Summary Field drop down you have to type in the name of a date field, since the date fields do not appear in the drop down.  You have to ignore the warning that you have chosen a field type that has no valid statistics options.  Then you can choose the Min or Max summary options for the date field.

However, when the tool runs the output Min or Max fields will convert the date to a double field.  The number is a correct representation of the appropriated date (at least for fgdb tables), so if you add a date field and use the field calculator to transfer the Min or Max date summary numbers into the added field, the correct dates will appear in the calculated date field.

So that is the workaround for using Summary Statistics tool to do a Min or Max date field summary.  Far from ideal, but it can work.
0 Kudos