Does CreateFeatureclass_management preserve field order?

StormwaterWater_Resources · ‎06-17-2013

EDIT: My question about order preservation still stands, but my root needs are better addressed in my follow-up question here: (Is there a token for everything EXCEPT SHAPE@): http://forums.arcgis.com/threads/86790-Is-there-a-token-for-everything-EXCEPT-SHAPE?p=306729#post306.... Creating my own "complete" field list proved easier than I had imagined, thanks to a function provided by Caleb1987 in the thread linked above.

Original question:

The CreateFeatureclass_management help page describes the Template parameter as:

The feature class used as a template to define the attribute schema of the feature class.

Does this include the field order? I ask because the help page for arcpy.da.InsertCursor says:

Use an asterisk (*) instead of a list of fields if you want to access all fields from the input table [...]. However, for faster performance and reliable field order, it is recommended that the list of fields be narrowed to only those that are actually needed.

In other words, if I use a search cursor with "*" for fields then can I directly assign that tuple to my new FC? i.e. is the following valid? Will the field order for the row tuple be preserved between the template and new feature classes (excepting the caveats in the help pages regarding blob and raster types)?

arcpy.CreateFeatureclass_management( newFCpath, newFCname, "POINT", templateFC, "DISABLED", "DISABLED", templateFC )

newRows      = arcpy.da.InsertCursor(newFC,      "*")
templateRows = arcpy.da.SearchCursor(templateFC, "*")

for row in templateRows:
    newRows.insertRow( row )

PS: I just thought of an additional caveat when using feature classes with different geometries, but I've formed it as a separate question here (Is there a token for everything EXCEPT SHAPE@): http://forums.arcgis.com/threads/86790-Is-there-a-token-for-everything-EXCEPT-SHAPE?p=306729#post306...

PPS: Based on comments from Caleb1987 on my other post I realize the code above will not work because the OID will be a part of the tuple from the search cursor, and I can't assign the OID to the new feature. The question about field order still stands though.
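
For anyone hitting the same OID problem: one way around "*" is to build an explicit field list that leaves out the fields an insert cursor cannot write. The sketch below is only an illustration and is not the function from the linked thread. It assumes field objects with `name` and `type` attributes, as returned by arcpy.ListFields, and that the OID and geometry fields report the types "OID" and "Geometry". The helper name `copyable_field_names`, the stand-in `Field` tuple, and the sample schema are hypothetical.

```python
from collections import namedtuple

# Stand-in for arcpy Field objects so the sketch runs without ArcGIS.
Field = namedtuple("Field", ["name", "type"])


def copyable_field_names(fields, skip_types=("OID", "Geometry")):
    """Return field names in their original order, skipping the given field types."""
    return [f.name for f in fields if f.type not in skip_types]


# Hypothetical schema, for illustration only.
template_fields = [
    Field("OBJECTID", "OID"),
    Field("Shape", "Geometry"),
    Field("POLY_NUM", "Integer"),
    Field("TYPE", "String"),
]

names = copyable_field_names(template_fields)
print(names)
```

With ArcGIS available, something like `copyable_field_names(arcpy.ListFields(templateFC))` plus a geometry token such as "SHAPE@" should give one list that can be passed to both cursors, so the order is the one you specified rather than whatever "*" produces. Read-only fields such as Shape_Length are not caught by this type check and would need to be filtered separately.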

StormwaterWater_Resources · ‎06-18-2013

I'm still curious if these two cursors will return the same fields in the same order:

arcpy.CreateFeatureclass_management( newFCpath, newFCname, "POINT", templateFC, "DISABLED", "DISABLED", templateFC )

newRows      = arcpy.da.InsertCursor(newFC,      "*")
templateRows = arcpy.da.SearchCursor(templateFC, "*")

However I realize my comments below are not fully correct. The geometry will copy because "*" returns only the XY of the centroid, not the full geometry. The token SHAPE@ returns a geometry object.

Additionally, after looking through the variables while debugging in PyScripter I realized that using the "*" does not return field names, so my question above should really be a special case of the more general: how do you determine what is what when using "*" for fields?

MathewCoyle · ‎06-18-2013

Your cursor has a fields property that you can query to get a tuple of the field names in index order.

Eg

templateRows.fields

Edit: To bridge the gap that remains you can just make a dictionary that maps each field name to its index, so you can reference the name instead of the index.

d = {}
for index, field in enumerate(templateRows.fields):
    d[field] = index

So you can do something like this.

for row in templateRows:
    val = row[d["field_name"]]

Dictionaries are pretty efficient so I don't imagine this would be a big hit on performance, though I haven't done any benchmarks on it.
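
The same name-to-index lookup can also line a row up with a different field order, which sidesteps the worry about whether two "*" cursors agree. A minimal sketch, using plain tuples in place of the two cursors' field tuples and a cursor row. The names `build_index` and `reorder_row` and the sample field names are hypothetical.

```python
def build_index(fields):
    """Map each field name to its position in a fields tuple."""
    return {name: i for i, name in enumerate(fields)}


def reorder_row(row, src_fields, dst_fields):
    """Return the row's values arranged in dst_fields order.

    Raises KeyError if dst_fields names a field missing from src_fields.
    """
    src_index = build_index(src_fields)
    return tuple(row[src_index[name]] for name in dst_fields)


# Stand-ins for the source and destination field tuples and one row.
src_fields = ("POLY_NUM", "TYPE", "AREA_HA")
dst_fields = ("TYPE", "AREA_HA", "POLY_NUM")
row = (42, "wetland", 3.5)

print(reorder_row(row, src_fields, dst_fields))
```

Because the values are looked up by name, the result does not depend on the two tables listing their fields in the same order, only on the names matching.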

StormwaterWater_Resources · ‎06-18-2013

templateRows.fields

Sure enough. There it is staring me in the face from the help page. I guess I looked at the parameters, and then skipped down to the examples, jumping right over that messy stuff in between.

I plussed and checked your reply because, while it does not directly answer the *original* question, it pretty much renders it moot, and you've directly answered the more general follow-up question.

To bridge the gap that remains you can just make a dictionary of index/field name values to reference the name instead of the index.
for index, field in enumerate(templateRows.fields):
    d[field] = index
So you can do something like this.
for row in templateRows:
    val = row[d["field_name"]]

I've used dictionaries extensively in Perl (where they're called hashes), even for exactly this purpose, but hadn't yet thought to use them in Python. That is an excellent use. I hate using just numbers as indexes. I'll gladly trade the extra code for the clarity of what index you're accessing.

Thank you!
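
If attribute-style access reads better than dictionary lookups, the standard library's namedtuple is another option for giving row positions names. This is a sketch under assumptions, not something suggested in the thread: the field names and rows are made up, and a plain tuple stands in for templateRows.fields.

```python
from collections import namedtuple

# Stand-in for templateRows.fields; real names would come from the cursor.
fields = ("OBJECTID", "POLY_NUM", "TYPE")

# rename=True replaces names that are not valid Python identifiers
# (for example tokens such as "SHAPE@") with positional names.
Row = namedtuple("Row", fields, rename=True)

for raw in [(1, 42, "wetland"), (2, 43, "pond")]:
    row = Row(*raw)
    print(row.POLY_NUM, row.TYPE)
```

The trade-off is one extra object per row, so on large tables the dictionary lookup may be the cheaper choice.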

MathewCoyle · ‎06-18-2013

To your original question, the tool 'should' preserve field order, with the caveat that the system fields usually rearrange themselves.

E.g. OID, Shape, Your_Fields, Shape_Length/Area

Also here are some of the quick trials I ran to show the difference between straight index referencing and dict field name referencing and the code used to generate them below. Was going through ~900k rows in a table.

index time: 19.5900778076
index time: 19.2306036588
index time: 19.4666773613
index time: 19.395765612
index time: 19.5033476872
index time: 19.3282738665
index time: 19.2130868052
index time: 19.0405310563
index time: 19.115207967
index time: 19.1235100947
dict time: 20.4310177626
dict time: 20.4020019139
dict time: 20.3490997958
dict time: 20.4130341569
dict time: 20.2006479064
dict time: 20.3282768531
dict time: 20.168064434
dict time: 20.0710841148
dict time: 20.1822551461
dict time: 20.4257584827

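import arcpy
from timeit import default_timer as _timer  # assumption: the post does not show how _timer was defined

table = r"C:\path\to\table"  # placeholder: the original table path is not given
index_times = []
dict_times = []

def main_index():
    # Read two values per row by hard-coded position.
    curs = arcpy.da.SearchCursor(table, '*')
    t0 = _timer()
    for row in curs:
        val1 = row[6]
        val2 = row[7]
    t1 = _timer() - t0
    index_times.append(t1)
    print('index time: {0}'.format(t1))

def main_dict():
    # Read the same two values by field name via a name-to-index dictionary.
    curs = arcpy.da.SearchCursor(table, '*')
    t0 = _timer()
    d = {}
    for index, field in enumerate(curs.fields):
        d[field] = index
    for row in curs:
        val1 = row[d['POLY_NUM']]
        val2 = row[d['TYPE']]
    t1 = _timer() - t0  # includes the (small) cost of building the dictionary
    dict_times.append(t1)
    print('dict time: {0}'.format(t1))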
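
To put the timings listed above in perspective, here is a small sketch that averages the ten runs of each approach and reports the relative overhead of the dictionary lookup. The numbers are copied from the results in this post; the variable names are just for illustration and nothing here touches arcpy.

```python
from statistics import mean

# Timings in seconds, copied from the results listed above.
index_times = [
    19.5900778076, 19.2306036588, 19.4666773613, 19.395765612, 19.5033476872,
    19.3282738665, 19.2130868052, 19.0405310563, 19.115207967, 19.1235100947,
]
dict_times = [
    20.4310177626, 20.4020019139, 20.3490997958, 20.4130341569, 20.2006479064,
    20.3282768531, 20.168064434, 20.0710841148, 20.1822551461, 20.4257584827,
]

index_mean = mean(index_times)
dict_mean = mean(dict_times)

# Extra time for the dictionary version, as a fraction of the index version.
overhead = (dict_mean - index_mean) / index_mean

print("index mean: {0:.2f} s".format(index_mean))
print("dict mean:  {0:.2f} s".format(dict_mean))
print("overhead:   {0:.1%}".format(overhead))
```

On these figures the dictionary version comes out about one second slower over the ~900k rows, roughly 5%.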