This is kind of an update to my last post: Python Crashing: Process finished with exit code -1073741819 (0xC0000005)
I understand a bit more about *what* is happening to cause my code to crash, but I'm fuzzy on the details of *why* because my CompSci-fu is very weak. Apparently the crash is an access violation: something is trying to read or write memory it doesn't have permission for, and it most likely happens inside the underlying C code of the arcpy functions. I have the PyCharm debugger at my disposal, but I'm not really sure what to look for in the debugger to figure out what I need to change to make my code happy.
I'd really like to *not* change my code, but I'm not sure I have an option unless I get some professional intervention.
My code is crashing in two places: sometimes between lines 368 and 371, after varying numbers of iterations through the outside loop, OR after line 381, once the table has been converted to Excel and saved but before the script finishes on its own.
It crashes with the following notification in the console: Process finished with exit code -1073741819 (0xC0000005)
Do you have any suggestions for me? I don't NEED the intermediate step of a GIS table; I could write directly to a CSV or Excel file, I guess. My concern is that the values I am writing contain commas, and I don't want those to break the CSV. I'm also hesitant to introduce a third-party library, for code-upkeep reasons.
Basically, I'm looking for suggestions on how to improve my code to avoid the error I'm getting, knowing that the error most likely originates in the arcpy.da.InsertCursor or arcpy.TableToExcel_conversion functions, with a preference for built-in Python functions and libraries.
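(From what I can tell, the built-in csv module quotes any value that contains the delimiter, so commas inside my values shouldn't actually break a CSV. A throwaway sketch — the path and rows here are invented, and "wb" is because this is Python 2:)

import csv

with open(r"D:\Python\scratch\demo.csv", "wb") as f:
    writer = csv.writer(f)  # QUOTE_MINIMAL by default: fields containing commas get quoted
    writer.writerow(["CODE", "BMP"])
    writer.writerow(["CW1", "Constructed Wetland- Level 1, modified"])  # embedded comma

The second row comes out as: CW1,"Constructed Wetland- Level 1, modified" — so the embedded comma survives.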
#### _______________IMPORT MODULES_________________###
print("Preparing necessary files...")
import os
import arcpy
import copy

### ______________INPUT FILES___________________###
outline = r"D:\Python\Inputs\LakeRidge\Lakeridge.gdb\LRoutline"
DA = r"D:\Python\Inputs\LakeRidge\Lakeridge.gdb\SubAreas"
DAID = "DA"  # field in DA shapefile where unique DA values exist
soil = r"D:\Python\Inputs\VB_SoilsPRJ.shp"
WTin = r"D:\Python\Inputs\wtdepth1"
DEMin = r"D:\Python\Inputs\2013 5ft DEM.img"
MapLoc = r"D:\Python\Inputs\LakeRidge\LRpythontest.mxd"
WT = arcpy.Raster(WTin)
DEM = arcpy.Raster(DEMin)

### ________________SET ENVIRONMENTS__________________###
# Check out extension and overwrite outputs
arcpy.CheckOutExtension("spatial")
arcpy.env.overwriteOutput = True

# Set map document
mxd = arcpy.mapping.MapDocument(MapLoc)

# Create project folder and set workspace
print("Checking for and creating output folders for spatial data...")
WorkPath = MapLoc[:-4]
if not os.path.exists(WorkPath):
    os.makedirs(WorkPath)
arcpy.env.workspace = WorkPath

# Create scratch workspace
ScratchPath = str(WorkPath) + r"\scratch"
if not os.path.exists(ScratchPath):
    os.makedirs(ScratchPath)
arcpy.env.scratchWorkspace = ScratchPath

# Create GDB
path, filename = os.path.split(MapLoc)
GDB = filename[:-4] + ".gdb"
GDBpath = MapLoc[:-4] + ".gdb"
if not os.path.exists(GDBpath):
    arcpy.CreateFileGDB_management(path, GDB)

# Create main output table folder if it does not exist and create project folder
print("Checking for and creating output space for Excel files...")
TabPath = r"D:\Python\Results" "\\"
ProjFolder = TabPath + filename[:-4]
if not os.path.exists(TabPath):
    os.makedirs(TabPath)
if not os.path.exists(ProjFolder):
    os.makedirs(ProjFolder)

# Define location of constraint database and establish GIS table output location
print("Checking for and creating output space for GIS tables...")
CRIT = TabPath + "constraints.xlsx"
BMPFold = ProjFolder + r"\GIS-Tables"
if not os.path.exists(BMPFold):
    os.makedirs(BMPFold)

### _________________VERIFY INPUTS________________###
# Check that all inputs have the same projection and update list with projected file path names
print("Verifying that coordinate systems are the same...")
InSHP = [outline, DA, soil]
InRAS = [WT]
# The base projected coordinate system (PCS) is the DEM's PCS
DEMSR = arcpy.Describe(DEM).spatialReference.PCSCode
for i, l in enumerate(InSHP):
    sr = arcpy.Describe(l).spatialReference.PCSCode
    if sr != DEMSR and ".gdb" not in l:
        l = arcpy.Project_management(l, l[:-4] + "PRJ.shp", DEMSR)
        InSHP[i] = l
    elif sr != DEMSR and ".gdb" in l:
        l = arcpy.Project_management(l, l + "PRJ", DEMSR)
        InSHP[i] = l
sr = arcpy.Describe(WT).spatialReference.PCSCode
if sr != DEMSR:
    WTPRJ = arcpy.Raster(arcpy.ProjectRaster_management(WT, "WTPRJ", DEMSR, "CUBIC"))
    WTPRJ.save(WorkPath + r"\WT_PRJ")
    WT = WTPRJ

# Assign projected file paths to variable names
outline = InSHP[0]
DA = InSHP[1]
soil = InSHP[2]

### _____________SET PROCESSING EXTENTS____________ ###
# Set cell size
description = arcpy.Describe(DEM)
cellsize = description.children[0].meanCellHeight
print("Setting cell size to DEM cell size: " + str(cellsize) + " ft...")  # Replace ft with code to get units!!!
arcpy.env.cellSize = cellsize

# Create buffer around outline to use as mask
# Buffer distance is in feet
print("Creating an environment mask from the site outline shapefile...")
maskshp = arcpy.Buffer_analysis(outline, ScratchPath + r"\outline_buff", "50 Feet", "", "", "ALL")

# Convert buffer to raster
mask = arcpy.Raster(arcpy.PolygonToRaster_conversion(maskshp, "Id", ScratchPath + r"\rastermask"))
mask.save(ScratchPath + r"\rastermask")

# Set raster mask and snap raster
print("Setting raster mask and snap raster for project...")
arcpy.env.mask = mask
arcpy.env.snapRaster = mask
arcpy.env.extent = mask.extent

### _______________ASSIGN HSG________________###
# Many soils in the coastal plain are dual group soils: A/D, B/D, or C/D.
# The first letter is the drained condition and the second letter is the
# undrained condition. Soil is considered drained when the depth to water
# table is greater than two feet from the surface. This looks at the HSG
# assigned to the soil polygon and compares it to the depth-to-WT layer.
# If HSG is unknown or invalid, HSG is assigned D soil type.

# Convert soils shapefile to raster and assign integer values to HSG.
# A=1, B=2, C=3, D=4 and dual groups A/D=14, B/D=24, C/D=34.
# "---" is treated as a D soil.
print("Converting dual group soils to single groups...")
SoilUnclass = arcpy.PolygonToRaster_conversion(soil, "HSG", ScratchPath + r"\SoilUnclass",
                                               "MAXIMUM_COMBINED_AREA")
SoilClass = arcpy.sa.Reclassify(SoilUnclass, "HSG",
                                arcpy.sa.RemapValue([["A", 1], ["B", 2], ["C", 3], ["D", 4],
                                                     ["A/D", 14], ["B/D", 24], ["C/D", 34],
                                                     ["---", 4]]), "NODATA")
SoilClass.save(ScratchPath + r"\HSGraster")

# Determine whether locations with dual groups should be considered drained
# or undrained and assign a single HSG value to those locations
EffHSG = arcpy.sa.Con(SoilClass > 4, arcpy.sa.Con(WT >= 2.0, (SoilClass - 4) / 10, 4), SoilClass)
EffHSG.save(WorkPath + r"\EffectiveHSG")

### ______________SUMMARIZE DA PROPERTIES________________ ###
# Initialize expression to calculate area of polygons
exparea = "float(!shape.area@ACRES!)"

# Summarize total area for each DA
print("Summarizing DA characteristics...")
DAFld = [f.name for f in arcpy.ListFields(DA)]
if "Area" not in DAFld:
    arcpy.AddField_management(DA, "Area", "FLOAT", 7, 3)
arcpy.CalculateField_management(DA, "Area", exparea, "PYTHON")
stat_field = [["Area", "SUM"]]
field_combo = [DAID]
DA_area = arcpy.Statistics_analysis(DA, BMPFold + r"\DA_area", stat_field, field_combo)

# Convert area lookup table to dictionary
arcpy.AddField_management(DA_area, "T_AREA", "FLOAT", 7, 3)
with arcpy.da.UpdateCursor(DA_area, ["SUM_AREA", "T_AREA"]) as cursor:
    for r in cursor:
        r[0] *= 100.00
        r[0] = int(r[0])
        r[0] = float(r[0]) / 100.00
        r[1] = r[0]
        print(r[1])
        cursor.updateRow(r)
DA_area = {r[0]: r[1] for r in arcpy.da.SearchCursor(DA_area, [DAID, "T_AREA"])}

# Convert DA shapefile to raster
DAras = arcpy.Raster(arcpy.PolygonToRaster_conversion(DA, DAID, ScratchPath + r"\DAras", "MAXIMUM_AREA"))

# Calculate slope from DEM for the area of interest, convert to integer,
# and find the median slope in each DA
slope = arcpy.sa.Slope(DEM, "PERCENT_RISE")
slope.save(WorkPath + r"\slope")
roundslope = (slope + 0.005) * 100.00  # preserve the last 2 decimal places and round for truncation
slopeINT = arcpy.sa.Int(roundslope)  # convert to integer by truncation
med_slope100 = arcpy.sa.ZonalStatistics(DAras, "VALUE", slopeINT, "MEDIAN", "DATA")  # find median (integer operation)
med_slope100.save(ScratchPath + r"\intslope")
med_slope = med_slope100 / 100.00  # convert back to true median value
med_slope.save(WorkPath + r"\medslope")

# Find the median depth to water table in each DA rounded to 2 decimal places
roundWT = (WT + 0.005) * 100.00  # preserve the last 2 decimal places and round for truncation
WTINT = arcpy.sa.Int(roundWT)  # convert to integer by truncation
med_WT100 = arcpy.sa.ZonalStatistics(DAras, "VALUE", WTINT, "MEDIAN", "DATA")  # find median (integer operation)
med_WT100.save(ScratchPath + r"\intWT")
med_WT = med_WT100 / 100.00  # convert back to true median value
med_WT.save(WorkPath + r"\medWT")

# Combine rasters to give unique combinations
combo = arcpy.sa.Combine([DAras, EffHSG, med_WT100, med_slope100])
combo.save(WorkPath + r"\combo")
combo_table = arcpy.BuildRasterAttributeTable_management(combo)

# Convert integers to usable format
arcpy.AddField_management(combo_table, "HSG", "TEXT", "", "", 6)
arcpy.AddField_management(combo_table, "MEDSLOPE", "FLOAT", 5, 2)
arcpy.AddField_management(combo_table, "MEDWT", "FLOAT", 5, 2)
arcpy.AddField_management(combo_table, "T_AREA", "FLOAT", 7, 3)
with arcpy.da.UpdateCursor(combo_table, ["EFFECTIVEHSG", "HSG", "INTSLOPE", "MEDSLOPE",
                                         "INTWT", "MEDWT", "DARAS", "T_AREA"]) as cursor:
    for row in cursor:
        if row[0] == 1:
            row[1] = "A"
        if row[0] == 2:
            row[1] = "B"
        if row[0] == 3:
            row[1] = "C"
        if row[0] == 4:
            row[1] = "D"
        row[3] = float(row[2]) / 100.00
        row[5] = float(row[4]) / 100.00
        for k, v in DA_area.items():
            if row[6] == k:
                row[7] = v
        cursor.updateRow(row)

# Create dictionary for the DA information
DA_summary = {r[0]: r[1:] for r in arcpy.da.SearchCursor(combo_table,
              ["Rowid", "DARAS", "HSG", "MEDSLOPE", "MEDWT", "T_AREA"])}

### _____________COMPARE CRITERIA_____________________ ###
print("Loading constraint database...")
# Convert Excel constraint file to GIS table
compare = arcpy.ExcelToTable_conversion(CRIT, BMPFold + r"\BMP-constraints")
Fields = [f.name for f in arcpy.ListFields(compare)]

# Create dictionary from criteria table.
# Code is the key; other values are stored as a list.
D = {r[1]: r[2:] for r in arcpy.da.SearchCursor(compare, Fields)}

# Codes:
# SDAB   Simple Disconnection A&B
# SDCD   Simple Disconnection C&D
# SDSA   Simple Disconnection C&D with Soil Amendments
# CAAB   Sheet Flow Conservation Area A&B
# CACD   Sheet Flow Conservation Area C&D
# VFA    Sheet Flow Veg Filter A
# VFSA   Sheet Flow Veg Filter B,C&D with Soil Amendments
# GCAB   Grass Channel A&B
# GCCD   Grass Channel C&D
# GCSA   Grass Channel C&D with Soil Amendments
# MI1    Micro Infiltration- Level 1
# SI1    Small Infiltration- Level 1
# CI1    Conventional Infiltration- Level 1
# MI2    Micro Infiltration- Level 2
# SI2    Small Infiltration- Level 2
# CI2    Conventional Infiltration- Level 2
# BRE1   Bioretention Basin- Level 1
# BRE2   Bioretention Basin- Level 2
# DS1    Dry Swale- Level 1
# DS2    Dry Swale- Level 2
# WS1    Wet Swale- Level 1
# WS2    Wet Swale- Level 2
# F1     Filter- Level 1
# F2     Filter- Level 2
# CW1    Constructed Wetland- Level 1
# CW2    Constructed Wetland- Level 2
# WP1    Wet Pond- Level 1
# WP2    Wet Pond- Level 2
# WPGW1  Wet Pond with GW- Level 1
# WPGW2  Wet Pond with GW- Level 2
# EDP1   ED Pond- Level 1
# EDP2   ED Pond- Level 2

# Reference (index into each criteria list):
# 0  - BMP
# 1  - RR
# 2  - PR
# 3  - TPR
# 4  - NR
# 5  - TNR
# 6  - SOIL
# 7  - MAX_SLOPE
# 8  - MIN_CDA
# 9  - MAX_CDA
# 10 - WT_SEP
# 11 - WT_RELAX (boolean)
# 12 - COAST_SEP
# 13 - MIN_DEPTH
# 14 - DEPTH_RELAX (boolean)
# 15 - COAST_MIN_DEPTH
# 16 - PWOP_PREF
# 17 - YEAR_COST

# Create output table for BMPs lumped by DA and HSG with criteria table as template
Lump = arcpy.CreateTable_management(BMPFold + "\\", "BMP-Allowable")
drop = ["OBJECTID", "FIELD1"]
arcpy.AddField_management(Lump, "CODE", "TEXT", "", "", 8)
arcpy.AddField_management(Lump, "DA", "TEXT", "", "", 15)
arcpy.AddField_management(Lump, "HSG", "TEXT", "", "", 6)
arcpy.AddField_management(Lump, "BMP", "TEXT", "", "", 50)
arcpy.AddField_management(Lump, "MOD", "TEXT", "", "", 25)
arcpy.AddField_management(Lump, "RR", "SHORT")
arcpy.AddField_management(Lump, "PR", "SHORT")
arcpy.AddField_management(Lump, "TPR", "SHORT")
arcpy.AddField_management(Lump, "NR", "SHORT")
arcpy.AddField_management(Lump, "TNR", "SHORT")
arcpy.AddField_management(Lump, "PWOP_PREF", "TEXT", "", "", 25)
arcpy.AddField_management(Lump, "YEAR_COST", "TEXT", "", "", 30)
arcpy.DeleteField_management(Lump, drop)
Fields = [f.name for f in arcpy.ListFields(Lump)]

# Create table to build "Rejected BMP" table
Fail = arcpy.Copy_management(Lump, BMPFold + r"\BMP-Rejected")
arcpy.AddField_management(Fail, "RSN_FAILED", "TEXT", "", "", 50)
drop = ["BMP", "MOD", "RR", "PR", "TPR", "NR", "TNR", "PWOP_PREF", "YEAR_COST"]
arcpy.DeleteField_management(Fail, drop)
FFields = [f.name for f in arcpy.ListFields(Fail)]

print("Comparing site values to constraints...")
# Compare the lumped parameters to the constraint dictionary
for key, value in DA_summary.items():
    # Duplicate criteria dictionary that can be amended throughout the loop
    BMP = copy.deepcopy(D)
    # Initialize empty dictionaries to store BMPs that fail each test
    NoBMP = {}
    Mod = {}
    # Compare lumped values in each DA/HSG pair to those in the constraint table
    for k, v in D.items():
        # Test if soil type is incorrect for each BMP and store reason for failure
        if value[1] not in v[6]:
            NoBMP[k] = "Soil type mismatch"
        # Compare median slope to maximum slope
        if value[2] > v[7]:
            if k not in NoBMP.keys():
                NoBMP[k] = "Slope too steep"
            else:
                NoBMP[k] += ", Slope too steep"
        # Compare WT depths
        if v[10] == 0:
            Mod[k] = "---"
        elif v[13] + v[10] <= value[3]:
            Mod[k] = "---"
        elif v[13] + v[10] > value[3]:
            # Check if coastal modification allows use of practice
            if v[11] == 1:
                coast_WT = v[12]
            else:
                coast_WT = v[10]
            if v[14] == 1:
                coast_depth = v[15]
            else:
                coast_depth = v[13]
            # Notate if coastal modification allows for practice use
            if coast_WT + coast_depth <= value[3]:
                if v[11] == 1 and v[14] == 1:
                    Mod[k] = "Separation and Depth"
                elif v[11] == 1:
                    Mod[k] = "WT Separation"
                elif v[14] == 1:
                    Mod[k] = "Practice Depth"
            else:
                Mod[k] = "---"
            # Remove the practice if coastal modifications do not help
            if coast_WT + coast_depth > value[3]:
                if k not in NoBMP.keys():
                    NoBMP[k] = "WT proximity"
                else:
                    NoBMP[k] += ", WT proximity"
        # Compare allowable contributing drainage areas (in acres).
        # Maximum CDA neglected because this is lumped analysis.
        if v[8] >= value[4]:
            if k not in NoBMP.keys():
                NoBMP[k] = "CDA too small"
            else:
                NoBMP[k] += ", CDA too small"

    # Compare keys in BMP and NoBMP dictionaries. Remove matching pairs from the BMP dictionary.
    for k in BMP.keys():
        if k in NoBMP.keys():
            del BMP[k]

    # Write remaining BMPs to table
    with arcpy.da.InsertCursor(Lump, Fields) as cursor:
        for k, v in BMP.items():
            cursor.insertRow((0, k, value[0], value[1], v[0], Mod[k],
                              v[1], v[2], v[3], v[4], v[5], v[16], v[17]))

# Sort values in table, effectively ranking them
print("Ranking BMPs...")
LumpSort = arcpy.Sort_management(Lump, BMPFold + "\\LumpSort",
                                 [["DA", "ASCENDING"], ["HSG", "ASCENDING"], ["TPR", "DESCENDING"]])
arcpy.DeleteField_management(LumpSort, ["ROWID"])

# Convert tables to readable format outside of GIS (.xls)
print("Converting good BMPs to Excel format...")
arcpy.TableToExcel_conversion(LumpSort, ProjFolder + r"\Lumped-Result.xls")
Check the help for InsertCursor. You don't generally (ever?) use it with 'with', which is tricky, because you do use 'with' with SearchCursors and UpdateCursors.
That's interesting. Is the "with" statement treated differently from saying "cursor = X", or is that a preference/convention/speed thing?
This is my new code for that section:
# Write remaining BMPs to table
cursor = arcpy.da.InsertCursor(Lump, Fields)
for k, v in BMP.items():
    cursor.insertRow((0, k, value[0], value[1], v[0], Mod[k],
                      v[1], v[2], v[3], v[4], v[5], v[16], v[17]))
del cursor
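(A more defensive version of the same pattern — just a sketch reusing my Lump, Fields, BMP, and Mod variables — would release the cursor's lock even if an insert fails partway:)

cursor = arcpy.da.InsertCursor(Lump, Fields)
try:
    for k, v in BMP.items():
        cursor.insertRow((0, k, value[0], value[1], v[0], Mod[k],
                          v[1], v[2], v[3], v[4], v[5], v[16], v[17]))
finally:
    del cursor  # release the schema lock on the table no matter what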
It still crashed, unfortunately. But this time it crashed between lines 374 and 379, which is the first time that's happened, I think. Hmm!
How much data are you working with? Could you just store the intermediate in a python object? Something like a list of lists or a list of dictionaries?
For example:
lump = []
lump.append([0, k, value[0], value[1], v[0], Mod[k], v[1], v[2], v[3], v[4], v[5], v[16], v[17]])
or:
lump = []
lump.append({'field1Name': 0,
             'field2Name': k,
             'field3Name': value[0],
             'field4Name': value[1],
             'field5Name': v[0],
             'field6Name': Mod[k],
             'field7Name': v[1],
             'field8Name': v[2],
             'field9Name': v[3],
             'field10Name': v[4],
             'field11Name': v[5],
             'field12Name': v[16],
             'field13Name': v[17]})
You would need to modify the sorting code and figure out a way to get it into Excel, but there are lots of examples out there on how to do that.
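For the sorting piece, for example, the Sort_management call could become plain Python. A sketch, assuming the row layout above, where DA sits at index 2, HSG at index 3, and TPR (a number) at index 8:

# DA ascending, HSG ascending, TPR descending (negate the numeric value)
lump.sort(key=lambda r: (r[2], r[3], -r[8]))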
These test runs have all been on a 15-acre site split into 7 subareas with 2 soil types; the outside dictionary loop for this set of data only has 14 iterations. Dictionary D has around 30 key:value pairs, so at maximum the final table will have 30 entries for each of the 14 rows, or 420 entries. In reality it's only about 250 rows for this particular site. However, three of my inputs (DEM, WT, and soil) have spatial information for a whole city. The DEM and WT are rasters with a 5 ft cell size and are 2.4 GB each. I can't seem to find how big the soil shapefile is, but it's not any bigger than 2.4 GB.
Yes, ultimately, if I choose to go the direct-to-CSV/Excel route, I'd be storing an intermediate Python object. I had been trying to figure out how to store everything as a dictionary, but the keys would duplicate for each row and overwrite themselves; the list of lists solves that issue. So, thank you!
EDIT: Although, maybe I can try storing the intermediate and then using the cursor outside of the initial dictionary loop.
EDIT 2: I changed the end of my code and it still crashes after converting the table to Excel. But I do think it's going a lot faster, at least. I may have to bite the bullet and bring in an outside Excel writer. Here is the end of my code now.
    # Compare keys in BMP and NoBMP dictionaries. Remove matching pairs from the BMP dictionary.
    # (still inside the DA_summary loop)
    for k in BMP.keys():
        if k in NoBMP.keys():
            del BMP[k]

    # Add remaining BMPs to an intermediate list of lists
    # (BMPs is initialized as an empty list before the loop)
    for k, v in BMP.items():
        BMPs.append([0, k, value[0], value[1], v[0], Mod[k],
                     v[1], v[2], v[3], v[4], v[5], v[16], v[17]])

# Write BMP list to GIS table (now outside the loop)
cursor = arcpy.da.InsertCursor(Lump, Fields)
for l in BMPs:
    cursor.insertRow((l[0], l[1], l[2], l[3], l[4], l[5], l[6],
                      l[7], l[8], l[9], l[10], l[11], l[12]))
del cursor

# Sort values in table, effectively ranking them
print("Ranking BMPs...")
LumpSort = arcpy.Sort_management(Lump, BMPFold + "\\LumpSort",
                                 [["DA", "ASCENDING"], ["HSG", "ASCENDING"], ["TPR", "DESCENDING"]])
arcpy.DeleteField_management(LumpSort, ["ROWID"])

# Convert tables to readable format outside of GIS (.xls)
print("Converting good BMPs to Excel format...")
arcpy.TableToExcel_conversion(LumpSort, ProjFolder + r"\Lumped-Result.xls")
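(If I do end up dropping the GIS table entirely, I think this whole tail end could shrink to something like the following — an untested sketch using only the built-in csv module, which quotes values that contain commas. The sort key assumes DA, HSG, and TPR sit at positions 2, 3, and 8 of each row, and "wb" is because this is Python 2:)

import csv

# Mirror Sort_management: DA ascending, HSG ascending, TPR descending
BMPs.sort(key=lambda r: (r[2], r[3], -r[8]))
with open(ProjFolder + r"\Lumped-Result.csv", "wb") as f:
    writer = csv.writer(f)
    writer.writerow(Fields)  # header row from the existing field name list
    writer.writerows(BMPs)

That would bypass both arcpy.da.InsertCursor and TableToExcel_conversion, the two suspected functions.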
You can experiment with the combination of NumPy, FeatureClassToNumPyArray (or TableToNumPyArray) and MatPlotLib (which is installed with ArcMap) to generate *.csv files.
Consider the following order of operations:
import numpy as np
import arcpy
import matplotlib.mlab as MPL

in_FC = r'F:\Test\Vegetation.shp'  # alter to suit
SR = arcpy.SpatialReference(2951)  # 'NAD_1983_CSRS_MTM_9'
output_csv = r'F:\Test\test.csv'
arr = arcpy.da.FeatureClassToNumPyArray(in_FC, "*", spatial_reference=SR, explode_to_points=False)
MPL.rec2csv(arr, output_csv, delimiter=",", withheader=True)
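If your matplotlib build doesn't have rec2csv (newer releases dropped it from matplotlib.mlab), NumPy itself can write the array. A rough equivalent, assuming NumPy 1.7+ for the header argument; note the Shape sub-array field simply gets stringified, as in the sample below:

np.savetxt(output_csv, arr, delimiter=",", fmt="%s",
           header=",".join(arr.dtype.names), comments="")  # comments="" keeps the header unprefixed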
Sample *.csv output
FID,Shape,MIFCODE,ATT1,ATT2,TYPE,Area,Perimeter,Xc,Yc
0,[ 340029.9297 5025429.8816],pl_veg_cov_c,1,11,Deciduous - successional stage unknown,4669.29,455.98399999999998
1,[ 340054.9509 5024769.849 ],pl_veg_cov_c,1,3,Early successional mixed forest,39144.995999999999,1359.9880000000001
2,[ 340337.5568 5024970.6994],pl_veg_cov_c,1,4,Late successional mixed forest,555562.47999999998,6123.2719999999999
3,[ 340362.3134 5024920.8256],pl_veg_cov_c,1,4,Late successional mixed forest,16332.526,830.67100000000005
More details are posted on my blog, in "Before I forget ... # 10 ... Features and attributes to *.dbf or *.csv ...".