lmurray22

Writing data from feature class to delimited text file

Discussion created by lmurray22 on Apr 4, 2013
I have several feature classes with values that I would like to export to a tilde (~) delimited text file using a Python script. My customer says tildes are the easiest delimiters for him to work with. I found csv.writer in Python's csv module, and it worked perfectly when I was only exporting 6 fields of data. However, when I added another 6 fields to the export, the script began to crash with the error "IOError: [Errno 22] Invalid argument." After a lot of searching on the web, I believe this is an issue with Windows, not Python, and that it is related to this article (http://support.microsoft.com/default.aspx?scid=kb;en-us;899149). The feature class is about 375 MB. I've already tried opening the output file as "w+b", as the article suggests, but that didn't make a difference. What's weird is that there's no consistency in when the error pops up. Sometimes all of the files are written successfully, sometimes the script crashes after writing a few lines to the first file, and sometimes it fails anywhere in between.

I'm at a loss on how to solve this. I'm still pretty new to Python and feel like maybe there's a way to break up the data so the script only writes X megabytes at a time. However, I don't know how to do that, or even how to determine how small the pieces need to be. The article mentions a 64 MB limit but, as I said, I've seen the script crash after a few lines are written to the first file (i.e. far less than 64 MB of data). Any help would be greatly appreciated. FYI, I'm running this script on Windows Server 2008 R2. Below is my code.

# ConvertFCtoTXT.py
# Created on: March 20, 2013
# Description: Converts DCYF/OLCR geocoded feature classes to tilde-delimited text files.
# Notes: Logging module requires this script be called through the DCYF-OLCR Daily Geocode Batch.py script.
# ---------------------------------------------------------------------------------------------------------------

import arcpy, csv, logging
from arcpy import env

#Set up variables
env.workspace = "C:\\Daily Geocode Processes\\DCYF OLCR Daily Geocode\\Workspace\\Workspace GDB.gdb\\GeocodedFeatures"
ProjectGDB = "C:\\Daily Geocode Processes\\DCYF OLCR Daily Geocode\\Workspace\\Workspace GDB.gdb\\GeocodedFeatures"
CubeWorkspace = "\\\\cubed03\\Geocoding"

# Log start date/time.
logging.info("Started ConvertFCtoTXT.py.")

# Create iterator to run process for each feature class in WorkspaceGDB.
for fc in arcpy.ListFeatureClasses():
   InputFC = ProjectGDB + "\\" + fc
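   # rstrip("_GC") strips any trailing "_", "G", or "C" characters rather than the suffix, so slice the suffix off instead.
   OutputTXT = CubeWorkspace + "\\" + (fc[:-3] if fc.endswith("_GC") else fc) + ".txt"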

   # get_fields is a generator function that lets the writer loop over each row in the table.
   # yield hands back one row at a time without storing all rows in memory; iteration stops when no more records exist.
   def get_fields(InputFC, Fields):
      with arcpy.da.SearchCursor(InputFC, Fields) as cursor:
         for row in cursor:
            yield row

   # Describe returns the feature class properties, including its fields
   DescribeFC = arcpy.Describe(InputFC)

   # Selects only the fields necessary to export
   FieldNames = [field.name for field in DescribeFC.fields if field.name
                 in ["Key", "GCAcc", "Address_Std", "City_Std", "State_Std", "ZIPCode_Std", "County_Std", "X_Coordinate", "Y_Coordinate"]]

   # Defines rows using the get_fields function (see above)
   rows = get_fields(InputFC, FieldNames)

   # Opens the output file, prepares it to write as ~ delimited, and writes the headers and rows to the table.
   # Python 2's csv module expects the file to be opened in binary mode ('wb').
   with open(OutputTXT,'wb') as out_file:
      out_writer = csv.writer(out_file, delimiter="~")
      out_writer.writerow(FieldNames) #writes the headers to the file
      for row in rows:
         out_writer.writerow(row) #writes the rows to the file

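   # The with block closes out_file automatically, so no explicit close() is needed.
   # "del row" is dropped because it raises NameError when a feature class has no rows.
   del rows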
   print OutputTXT + " done"

del fc

# Log completion date/time
logging.info("Completed ConvertFCtoTXT.py.")
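
For reference, here is a minimal sketch of the "write only so much at a time" idea I describe above. It is not tested against the network share, and I don't know whether it actually avoids the Errno 22 error. The names write_in_chunks, open_for_csv, and chunk_size are hypothetical helpers made up for illustration; the sketch pulls a fixed number of rows from any row iterator (such as the get_fields generator), writes them, and flushes the file before fetching the next batch.

```python
import csv
import itertools
import sys


def open_for_csv(path):
    # Hypothetical helper: Python 2's csv module wants binary mode,
    # while Python 3 wants text mode with newline="".
    if sys.version_info[0] < 3:
        return open(path, "wb")
    return open(path, "w", newline="")


def write_in_chunks(rows, header, path, chunk_size=10000):
    # Write rows to a ~ delimited file, chunk_size rows at a time.
    # rows can be any iterable of row tuples, e.g. a SearchCursor generator.
    # chunk_size is an arbitrary assumption, not a known safe value.
    rows = iter(rows)
    total = 0
    with open_for_csv(path) as out_file:
        writer = csv.writer(out_file, delimiter="~")
        writer.writerow(header)
        while True:
            # islice takes at most chunk_size rows without loading the rest.
            chunk = list(itertools.islice(rows, chunk_size))
            if not chunk:
                break
            writer.writerows(chunk)
            out_file.flush()  # push this batch to the OS before continuing
            total += len(chunk)
    return total
```

In my script this would be called as write_in_chunks(get_fields(InputFC, FieldNames), FieldNames, OutputTXT) in place of the writerow loop. It counts rows rather than megabytes, so the chunk size would have to be tuned by trial.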
