Referencing Shapefiles in Python and Printing Shapefile Entries

RachaelJohnson · ‎04-08-2015

I am a complete Python novice and I have been searching and searching and so far, I am unable to find a straightforward answer to my question.

I want to write a statement that checks if a shapefile has a field "Area", and if not, make one. This is what I have:

outline="LRoutline"

if "Area" not in outline:

arcpy.AddField_management(outline, "Area", "FLOAT", 6, 3)

The If statement isn't working. When I run it again, the code gets hung up because there is already a field named "Area". Obviously, the code is not looking if there is a field named "Area" but rather it is looking for that string in the variable. How do I reference the fields inside a shapefile within an If statement? outline.Area doesn't work and neither does !Area!. I have tried the SearchCursor thing using the following code and it outputs <Cursor object at 0x137ed0f0[0x13587f80]>, and I don't know what that means.

arcpy.SearchCursor(outline)

Furthermore, I want to sum the areas in that Area field after they are calculated, so I need to be able to reference the field, right? I looked at the arcpy.Statistics_analysis() function and I need to create a whole separate table to pull one value from the shapefile? That's not required to view the summary statistics in the ArcGIS GUI (right click on field in attribute table-> Summary...). Is there a less cluttered way to sum the entries in a shapefile field in Python?

I have downloaded the shapefile package but that doesn't seem to fit what I need either.

BlakeTerhune · ‎04-08-2015

It's referring to an index into the row variable. The search cursor returns each row as a tuple (an immutable sequence, similar to a list in Python). The number of items in that tuple equals the number of fields you passed in when you created the cursor. In this case, there's only one field, so it returns a tuple with only one value:

(23456.93426,)

The comma in there is important: a single-item tuple is written with a trailing comma (unlike a single-item list), and without it the parentheses would just wrap a plain number.

So when I say row[0], it's getting the value of the first item, which is also known as index 0. Index 1 would be the second item, and so on. If I just said row, it would return all field values in the whole row as a tuple instead of just the single value as a number.
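As a minimal sketch of the indexing described above, the snippet below uses hand-written tuples in place of real cursor rows (the values and the name `not_a_row` are made up for illustration; no arcpy is involved):

```python
# A one-field row, written the way the cursor would return it.
row = (23456.93426,)
print(len(row))   # 1
print(row[0])     # 23456.93426

# Without the trailing comma the parentheses are just grouping,
# so this is a plain float, not a tuple.
not_a_row = (23456.93426)
print(type(not_a_row).__name__)  # float

# A hypothetical three-field row: index 2 is the third item.
multi = (7, "SomeValue", 23456.93426)
print(multi[2])   # 23456.93426
```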

RachaelJohnson · ‎04-08-2015

So if I had

arcpy.da.SearchCursor(outline, outFld)])

instead of

arcpy.da.SearchCursor(outline, ["Area"])]),

I would have to change the code to

siteArea= sum([row[3] for row in arcpy.da.SearchCursor(outline, [outFld])])

if "Area" was the fourth field in the shapefile?

BlakeTerhune · ‎04-08-2015

Yes. Since I don't know exactly what outFld contains, I'll draft an example...

import arcpy

fc = r'PathToLayer\LayerName'
fields = ["ObjectID", "SomeField", "Area"]

with arcpy.da.SearchCursor(fc, fields) as sCursor:
    for row in sCursor:
        print(row)  ## Prints tuple of all field values
        print(row[0])  ## Prints single field value in first field, ObjectID
        print(row[1])  ## Prints single field value in second field, SomeField
        print(row[2])  ## Prints single field value in third field, Area

The example here using the with statement is the more typical way you'll see cursors being used. The example I posted earlier uses the cursor in what's called a list comprehension, which is shorthand for building a list from an iterable. The code from James Crandall also uses a list comprehension to create a list of field names.
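To show that the two styles are equivalent, here is a rough sketch that sums an "Area" column both ways. The `rows` list is a made-up stand-in for what a cursor opened on ["ObjectID", "SomeField", "Area"] might yield, so it runs without arcpy:

```python
# Hypothetical rows standing in for cursor output (ObjectID, SomeField, Area).
rows = [(1, "a", 10.5), (2, "b", 20.0), (3, "c", 4.5)]

# Explicit loop, as in the with-statement example.
total_loop = 0.0
for row in rows:
    total_loop += row[2]  # index 2 is Area

# List comprehension: build the list of areas, then sum it.
total_comp = sum([row[2] for row in rows])

print(total_loop)  # 35.0
print(total_comp)  # 35.0
```

With a real cursor, `rows` would be replaced by the cursor object itself.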

RachaelJohnson · ‎04-08-2015

Awesome, thank you for sticking it out with me and answering my questions. Today is my first day coding with Python after only doing the "Python for Everyone" tutorial in the Esri training modules.

If I wanted to print the value in the second field, would it be

fc = r'PathToLayer\LayerNamel'  
fields = ["ObjectID", "SomeField", "Area"]  
  
with arcpy.da.SearchCursor(fc, ["SomeField"]) as sCursor:  
    for row in sCursor:  
        print row[1]

BlakeTerhune · ‎04-08-2015

Almost, but not quite. That would give you an error because there is nothing at index 1.

The field_names parameter of the SearchCursor can be given as a list written directly in the call, or you can assign the list to a variable ahead of time and pass that variable as field_names. Don't do both.

In your example, you've created a fields variable that contains a list of field name strings. Then, when you create the cursor, you explicitly tell it to use only "SomeField". What you'd get are rows as single-item tuples holding one field value, for SomeField. If you instead wanted access to the values from all three fields you specified in the fields variable, create the cursor with fields as the field_names parameter (instead of just ["SomeField"]), like I did in the example just before.

These two examples would both do the same thing: print the field value of SomeField for every row in the feature class.

fc = r'PathToLayer\LayerName'
fields = ["ObjectID", "SomeField", "Area"]

with arcpy.da.SearchCursor(fc, fields) as sCursor:
    for row in sCursor:
        print(row[1])

fc = r'PathToLayer\LayerName'

with arcpy.da.SearchCursor(fc, ["SomeField"]) as sCursor:
    for row in sCursor:
        print(row[0])

Choose one or the other, but don't combine them both.

IanMurray · ‎04-08-2015

It would depend on what the variable outFld contains. If it contains a list of fields in which "Area" is the fourth field, then yes, you would use row[3]. However, if it only contained a single field, "Area", then you would use row[0]. Generally it is good practice to pass the cursor only the fields you will actually use to check or update data, since it will run faster if it needs values from one or two fields than from 20.
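One way to avoid hard-coding the index is to look it up from the same list that is passed to the cursor. This is only a sketch: `rows` is a made-up stand-in for cursor output and `area_idx` is a name chosen for illustration.

```python
fields = ["ObjectID", "SomeField", "Area"]

# Position of "Area" in the field list is also its position in each row.
area_idx = fields.index("Area")
print(area_idx)  # 2

# Hypothetical rows in the same order as `fields`.
rows = [(1, "a", 10.5), (2, "b", 20.0)]
areas = [row[area_idx] for row in rows]
print(areas)  # [10.5, 20.0]
```

If the field list is later reordered or trimmed, the index follows along automatically.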

DanPatterson_Retired · ‎04-08-2015

Rachael ... for a Python moment: strings are iterables, just as lists are. To see where the pitfalls can arise, examine the following code segments:

>>> case_1='LRoutline'
>>> for i in case_1: print i
>>> print 'line' in case_1
>>> print 'Area' in case_1

Line 1: case_1 is defined as your input (the shapefile name)

Line 2: let's demonstrate that it is an iterable. Result:

>>> for i in case_1: print i
... 
L
R
o
u
t
l
i
n
e
>>>

Line 3: let's see whether a substring can be found in it. Result:

>>> print 'line' in case_1
True
>>>

Line 4: let's assume you think it is a file and want the Area field. Result:

>>> print 'Area' in case_1
False
>>>
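The same contrast can be written as a short script. It is a sketch only; the variable `outline` mirrors the name from the original question, and wrapping it in a list is just a way to show how `in` behaves differently on strings and lists:

```python
outline = "LRoutline"

# On a string, `in` is a substring test.
print("line" in outline)   # True
print("Area" in outline)   # False

# On a list, `in` compares whole elements for equality.
print("line" in [outline])        # False
print("LRoutline" in [outline])   # True
```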

Now for a more practical example...

>>> import arcpy
>>> shp_file = 'c:/temp/cube.shp'
>>> for field in arcpy.ListFields(shp_file): print field.name
... 
FID
Shape
Id
>>> 'Id' in arcpy.ListFields(shp_file)
False
>>>

The above returns False because, as pointed out in a subsequent post, ListFields gives you Field objects, so you need to compare against field.name, not the field itself.

>>> shp_file = 'c:/temp/cube.shp' 
>>> field_names = [field.name for field in arcpy.ListFields(shp_file)]
>>> field_names
[u'FID', u'Shape', u'Id']
>>> 'Id' in field_names
True
>>>
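Tying this back to the original question, a check-then-add step might look roughly like the sketch below. The helper names `field_missing` and `ensure_field` are hypothetical, and the case-insensitive comparison is an assumption made as a precaution, not something established in this thread. arcpy is imported inside the function so the pure-Python helper can be tried without ArcGIS installed.

```python
def field_missing(existing_names, wanted):
    """Return True if `wanted` is not among `existing_names`.

    Assumption: names are compared case-insensitively.
    """
    wanted = wanted.lower()
    return all(name.lower() != wanted for name in existing_names)


def ensure_field(table, name="Area"):
    """Add a FLOAT field called `name` to `table` unless one already exists."""
    import arcpy  # requires an ArcGIS Python environment

    existing = [field.name for field in arcpy.ListFields(table)]
    if field_missing(existing, name):
        # Same arguments as the AddField call in the original question.
        arcpy.AddField_management(table, name, "FLOAT", 6, 3)


print(field_missing(["FID", "Shape", "Id"], "Area"))    # True
print(field_missing(["FID", "Shape", "AREA"], "Area"))  # False
```

Inside ArcGIS, calling something like `ensure_field("LRoutline")` twice should then add the field only on the first call.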

Hope this helps for the future.

RachaelJohnson · ‎04-08-2015

>>> 'Id' in arcpy.ListFields(shp_file)

I don't understand why that returns False. Is it because arcpy.ListFields(shp_file) just returns: [<Field object at 0x5498250[0x5444ad0]>, <Field object at 0x5498110[0x5444908]>, <Field object at 0x54981b0[0x5444a40]>, <Field object at 0x5498190[0x5444848]>] ?

BlakeTerhune · ‎04-08-2015

Yes, that's correct. I think what Dan is getting at is that although ListFields returns a list you can iterate over, the Field objects in it can't be compared directly to a string; you have to use a property of the Field object, such as .name.
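The effect can be reproduced without arcpy. In this sketch, `FakeField` is a made-up stand-in class with only a `name` attribute; it is not the real arcpy Field class, but the membership test fails for the same reason:

```python
class FakeField(object):
    """Hypothetical stand-in for an arcpy Field: it only carries a name."""
    def __init__(self, name):
        self.name = name


fields = [FakeField(n) for n in ("FID", "Shape", "Id")]

# A string never equals one of these objects, so this is False.
print("Id" in fields)  # False

# Comparing against each object's .name works.
names = [f.name for f in fields]
print("Id" in names)   # True
print(any(f.name == "Id" for f in fields))  # True
```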

DanPatterson_Retired · ‎04-08-2015

Thanks Blake...I edited my post to reflect this