
Attribute Table to CSV

04-27-2010 10:54 AM
ChrisMathers
Deactivated User
This isn't a question, it's a post about a result! I spent some time trying to find a way to take the attribute table of a shapefile and make it a CSV from Python. I looked at an ArcScript that was rather complicated and didn't work with the current geoprocessor anyway. The old forums had a couple of topics on the matter as well, and none of the replies were really what I wanted. I wanted to avoid using win32com to call out to Excel or anything like that; that just seems messy to me.

The answer came in a flash before leaving work last night: combine gp.Describe with gp.SearchCursor. Using the csv library that ships with Python, you can open a new file and pass list objects into it. I iterate through the rows with a search cursor and, for each row, iterate through the fields with a for-in loop. Each pass of the for loop appends the value of that field for the row to a list. That list is then written to the CSV as a full line in the table. You'll notice in the while loop that during each pass I create and delete the list object "line". This may not be necessary, but I wanted to be safe and not accidentally keep adding to the same very long list instead of creating a new list for each line.

import arcgisscripting, csv
gp=arcgisscripting.create(9.3)

output=open(r'C:\GIS Projects\sandbox\test.csv','w')
linewriter=csv.writer(output,delimiter=',')

fcdescribe=gp.Describe('C:\\GIS Projects\\sandbox\\New_Shapefile(3).shp')
flds=fcdescribe.Fields

header=[]
for fld in flds:
    value=fld.Name
    header.append(value)
linewriter.writerow(header)

cursor=gp.SearchCursor('C:\\GIS Projects\\sandbox\\New_Shapefile(3).shp')
row=cursor.Next()

while row:
    line=[]
    for fld in flds:
        value=row.GetValue(fld.Name)
        line.append(value)
    linewriter.writerow(line)
    del line
    row=cursor.Next()
    
del cursor
output.close()


This method has the drawback of giving you a field for the OID and the geometry in the CSV. You can get around this by getting rid of the gp.Describe and just making a list object of your field names and using that instead. I made this process for a larger script that deals with updates to a parcels feature class, so I will always be exporting the same fields.

header=['Id','C','D','AtoC','BtoD']
linewriter.writerow(header)

cursor=gp.SearchCursor('C:\\GIS Projects\\sandbox\\New_Shapefile(3).shp')
row=cursor.Next()

while row:
    line=[]
    for fld in header:
        value=row.GetValue(fld)
        line.append(value)
    linewriter.writerow(line)
    del line
    row=cursor.Next()


I hope this helps someone out there, because it had me stumped for a good while. For some reason Excel wants to open this with a blank line between each record. I'm not sure why, but I think it has to do with what CSV dialect you set it to. I don't set one here because I change the extension to txt and cram the table into a building permit tracking system. If you want to use this to make fancier CSVs, the documentation for the csv library is here.
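
If the blank lines are a problem, one small tweak that should avoid them (a sketch only, reusing the two setup lines from the script above rather than a tested recipe) is to pass a lineterminator so the writer stops emitting its default \r\n into a text-mode file:

# Sketch: same setup as above, but with an explicit line terminator so Windows
# text mode doesn't turn the csv module's default \r\n into \r\r\n.
output = open(r'C:\GIS Projects\sandbox\test.csv', 'w')
linewriter = csv.writer(output, delimiter=',', lineterminator='\n')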


EDIT: Just had a thought. A list is a 0-based array, so if you want to keep gp.Describe, you can just pass the list using the slice syntax line[2:]. That way you write every list item except the first two (OID and Shape).
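
So, for example, the two writerow calls would just become (assuming the first two fields really are the OID and Shape fields):

linewriter.writerow(header[2:])   # header without OID and Shape
# ...and inside the while loop:
linewriter.writerow(line[2:])     # row values without OID and Shape
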
9 Replies
RDHarles
Regular Contributor
Interesting idea.

I did something similar about a year ago using a SearchCursor, ListFields and simple file I/O.

import arcgisscripting, os
gp = arcgisscripting.create()

gp.workspace = os.getcwd()

for table in os.listdir(os.getcwd()):
    if table.endswith('dbf'):

        # Make the output file name the same as the shapefile
        outFile = open(table[:-3]+"csv", 'a')        

        ## Append all the field names to the fldList
        fldList = []
        fld = gp.listFields(table)
        field = fld.next()
        while field:
            # Don't include ESRI's "FID" or "SHAPE" or "OID" fields.
            if field.name <> "FID" and field.name <> "Shape" and field.name <> "OID":
                fldList.append(field.name)
            field = fld.next()        

        ## Write the field names to the outFile
        # Get the length of the fldList so we know when not to add the delimiter
        total = len(fldList)
        count = 0
        for fieldName in fldList:            
            count = count+1            
            delimeter = ","
            # If you're not the last value in the list, add the delimiter; else, don't add it
            if not count == total:
                outFile.write(fieldName+delimeter)
            else:
                outFile.write(fieldName+"\n")

        printrow = 0
        ## Write the row values to the outFile
        rows = gp.SearchCursor(table)
        row = rows.Next()
        while row:
            printrow=printrow+1
            print printrow
            # Create a list for each row
            valList = []
            for fieldName in fldList:                
                delimeter = ","
                value = row.GetValue(fieldName)
                # Append each row of values to the valList
                valList.append(value)            
            count = 0            
            # Go through the list of values and add the delimiter
            for val in valList:                
                count = count+1
                # If you're not the last value in the list, add the delimiter; else, don't add it
                if not count == total:
                    outFile.write(str(val)+delimeter)
                else:                    
                    outFile.write(str(val)+"\n")
            row = rows.Next()

print "\nDone.\n"
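
(Side note: the count/total bookkeeping for the trailing comma can also be handed off to str.join; below is a minimal sketch of just the writing part, assuming gp, table, outFile and fldList are set up as in the script above.)

# Sketch: join() places the commas, so no count/total bookkeeping is needed.
outFile.write(",".join(fldList) + "\n")            # header line
rows = gp.SearchCursor(table)
row = rows.Next()
while row:
    valList = [str(row.GetValue(f)) for f in fldList]
    outFile.write(",".join(valList) + "\n")        # one record per line
    row = rows.Next()
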
DanPatterson_Retired
MVP Emeritus
You can also use sets to remove unwanted fields from lists, then convert back to a list.  This removes the need for potentially long if statements.  In the example below, the all_fields could be replaced by gp.ListFields.  The alternate method not involving sets follows.
>>> no_good_fields
['FID', 'Shape', 'OID']
>>> all_fields
['Area', 'Perimeter', 'Shape', 'FID', 'OID']
>>> good_fields = list(set(all_fields) - set(no_good_fields))
>>> for a_field in good_fields:
...  print a_field
...  
Perimeter
Area

>>> no_good_fields = ['FID', 'Shape', 'OID']
>>> all_fields = ['Area', 'Perimeter', 'Shape', 'FID', 'OID']
>>> for a_field in all_fields:
...  if a_field not in no_good_fields:
...   print a_field
...   
Area
Perimeter
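
At 9.3, where gp.ListFields returns a plain Python list, a rough sketch of plugging the order-preserving version into the CSV export above could look like this (shapefile here is just a placeholder for your input path):

no_good_fields = ['FID', 'Shape', 'OID']
all_fields = [f.Name for f in gp.ListFields(shapefile)]
good_fields = [f for f in all_fields if f not in no_good_fields]
linewriter.writerow(good_fields)   # header without the OID/geometry columns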

KeithSandell
Regular Contributor

import arcgisscripting, csv
gp=arcgisscripting.create(9.3)

output=open(r'C:\GIS Projects\sandbox\test.csv','w')
linewriter=csv.writer(output,delimiter=',')

fcdescribe=gp.Describe('C:\\GIS Projects\\sandbox\\New_Shapefile(3).shp')
flds=fcdescribe.Fields

header=[]
for fld in flds:
    value=fld.Name
    header.append(value)
linewriter.writerow(header)

cursor=gp.SearchCursor('C:\\GIS Projects\\sandbox\\New_Shapefile(3).shp')
row=cursor.Next()

while row:
    line=[]
    for fld in flds:
        value=row.GetValue(fld.Name)
        line.append(value)
    linewriter.writerow(line)
    del line
    row=cursor.Next()
    
del cursor
output.close()




I really appreciate the code that you shared. I have to work with a lot of parcel data as well, which happens to be why I was looking for this solution...as part of my bigger solution.

I used the code as is with the exception that I passed in some arguments, but I get the following error when I run the code:

<type 'exceptions.TypeError'>: 'geoprocessing list object' object is not iterable

It creates the csv file and then errors out. Any thoughts on what might be wrong? Thanks

Here is my code:


# Import system modules
import sys, string, os, arcgisscripting, csv

# Create the Geoprocessor object
gp = arcgisscripting.create()


# Local variables...
OutputFileName = sys.argv[1]
Target_FC = sys.argv[2]

output=open(OutputFileName,'w')
linewriter=csv.writer(output,delimiter=',')

fcdescribe=gp.Describe(Target_FC)
flds=fcdescribe.Fields

header=[]
for fld in flds:
    value=fld.Name
    header.append(value)
linewriter.writerow(header)

cursor=gp.SearchCursor(Target_FC)
row=cursor.Next()

while row:
    line=[]
    for fld in flds:
        value=row.GetValue(fld.Name)
        line.append(value)
    linewriter.writerow(line)
    del line
    row=cursor.Next()
    
del cursor
output.close()

DanPatterson_Retired
MVP Emeritus
You appear to be using ArcMap 9.2 or below; the example below is a snippet that demonstrates the differences. In 9.3, Python list objects are returned instead of the enumerations of previous versions. You will have to modify your code to include the Reset and Next lines (in two locations) to reflect the differences.

try:                       #9.2 and below
  fields.Reset()
  field = fields.Next()
  while field:
    if field.Type in ok_fields:
      num_fields.append(field.Name)
    field = fields.Next()
except:                    #9.3 and above
  for field in fields:
    if field.Type in ok_fields:
      num_fields.append(field.Name)
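
Alternatively, if 9.3 is installed, simply requesting the 9.3 geoprocessor up front, as the original post does, should make Describe(...).Fields come back as a Python list so the for-in loops work as written:

import arcgisscripting
# The 9.3 geoprocessor returns Fields as a Python list rather than an enumeration
gp = arcgisscripting.create(9.3)
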
DaleHoneycutt
Deactivated User
This is sort of related... There's a script tool in the Model & Script Tool Gallery to output a table to HTML or Excel. It's here
ChrisMathers
Deactivated User
I actually have an update for this. If you are using Windows, you have to open the output in binary mode. When each line is added, Python will put \r\n at the end of each line (a carriage return and a newline). Windows will intercede and add another \r to each line, which gives you \r\r\n. This means that Excel or ArcGIS will show a blank line between each record. The new line should be:

output=open(r'C:\GIS Projects\sandbox\test.csv','wb')


Thanks for the link, Dale. I was trying to use as little outside of the main Python libraries as possible. If there were a DBF library I would forgo the geoprocessor entirely for this.
DanPatterson_Retired
MVP Emeritus
Did you google "dbase Python"? There are a number out there, e.g.
http://code.activestate.com/recipes/362715-dbf-reader-and-writer/
LindaTedrow
Deactivated User
How would you get specific fields to write to the CSV file?
For instance with LiDAR data, how would you write only the six fields:
"ID", "x_1", "y_2", "i", "r", "height"?

I can get the entire shapefile or DBF file to write as a CSV, but how do I get just some of the fields?

Linda
ChrisMathers
Deactivated User
For this method just make a list of those fields and iterate over them to make your rows.
So the variable "header" would be ["ID", "x_1", "y_2", "i", "r", "height"].

I have found a flaw with this, though. Using the csv library is great if you are reading and writing a CSV, but if you are only writing, as is the case here, it is actually better to use the built-in file object instead, because the csv library can't write non-ASCII text. This isn't always an issue, but occasionally you may get some odd Unicode character in by accident. The benefit of using the csv library is that you don't have to worry about putting in your \n or joining the list into a string; the library does that for you.

Here I used Describe to list the fields and iterated over that. You can take that out if you would like and put in only a list of the fields you want to use.

import arcpy

outfile = open('YOUR CSV.csv', 'w') # Open a new file to write
FeatureClass = arcpy.Describe('YOUR FEATURE CLASS')
fields = FeatureClass.fields # Use arcpy.Describe to get the field names from the export of parce_test

header = [] # An empty list that will be populated with the field names
for field in fields: # For each field in the list of fields,
    header.append(field.name) # append the name to the header list
outfile.write('%s\n' % ','.join(header[2:])) # Write the header list to the TXT. The list is sliced using [2:] to drop the first two items, OID and Shape, because I don't want them

cursor = arcpy.SearchCursor('YOUR FEATURE CLASS')
for row in cursor:
    try:
        line = [] # Empty list to hold the data for each row
        for field in fields: # Same deal as before: for each field in the list of fields,
            line.append(str(row.getValue(field.name))) # add that value (as a string) to the end of the list "line" and repeat until out of fields
        outfile.write('%s\n' % ','.join(line[2:])) # Write the list to the TXT, again dropping the first two fields
    except:
        print 'Borked row: %s' % row.OID # If a row has a problem, print the OID and move on

del cursor # Delete the cursor object because this is what is holding a schema lock
outfile.close() # Close the TXT so that a lock is dropped from it as well


Yours could be:
import arcpy

outfile = open('YOUR CSV.csv', 'w') # Open a new file to write

header = ["ID", "x_1", "y_2", "i", "r", "height"]
outfile.write('%s\n' % ','.join(header)) # Write the header list to the TXT
cursor = arcpy.SearchCursor('YOUR LIDAR')
for row in cursor:
    try:
        line = []
        for field in header:
            line.append(str(row.getValue(field)))
        outfile.write('%s\n' % ','.join(line))
    except:
        print 'Borked row: %s' % row.OID

del cursor
outfile.close()