Performance issues in Python using gdb

martinschaefer1 · 10-23-2017

Hi,

In a couple of examples I've found that writing output to a file geodatabase (gdb) from Python toolboxes is much slower than writing to standalone files such as .dbf tables or shapefiles. I'm running ArcGIS Pro 2.0.1.

First, a toolbox with a function that needs arcpy.sa.Sample output for further processing:
1. output sample results to .dbf, read back into a dataframe, delete .dbf - 6.5s
2. output sample results to scratch gdb (which is cleaner), read back into a dataframe - 25s
Second, a toolbox that outputs new feature classes:
1. outputs to shp - 5.19s
2. outputs to gdb - 18.8s

In both cases the only difference in the code is that the output is stored in a gdb. Is anyone else seeing this performance issue with gdb storage?

Cheers

M.

Code snippet example for 1.1:
import os
import arcpy
import pandas as pd
from arcpy.sa import Sample

# dir, in_dem and in_fc are assumed to be defined earlier in the tool
out_table = os.path.join(dir, 'zzz_{}.dbf'.format(os.path.basename(in_fc)[:-4]))
Sample(in_dem, in_fc, out_table, 'NEAREST')
# read sample output into df
fields = [f.name for f in arcpy.ListFields(out_table)]
df_sample = pd.DataFrame(arcpy.da.TableToNumPyArray(
    in_table=out_table,
    field_names=fields,
    skip_nulls=True
))
# remove sample output (every file in dir whose name starts with 'zzz_')
for name in os.listdir(dir):
    if name.startswith('zzz_'):
        os.remove(os.path.join(dir, name))

Code snippet example for 1.2:

out_table = arcpy.CreateScratchName(workspace=arcpy.env.scratchGDB)
Sample(in_dem, in_fc, out_table, 'NEAREST')
# read sample output into df
fields = [f.name for f in arcpy.ListFields(out_table)]
df_sample = pd.DataFrame(arcpy.da.TableToNumPyArray(
    in_table=out_table,
    field_names=fields,
    skip_nulls=True
))
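
For anyone who wants to compare on their own data, here is a minimal sketch of how per-step timings like the ones above could be collected, so that the write (Sample) and the read-back (TableToNumPyArray) are measured separately. This is not the exact code from my toolbox: `timed` and `timings` are hypothetical names, and the placeholder workloads only stand in for the arcpy calls shown in the snippets.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the block raises, so failed runs still show up.
        results[label] = time.perf_counter() - start


timings = {}

with timed('write', timings):
    # Placeholder: in the toolbox this would be the Sample(...) call.
    sum(range(100000))

with timed('read', timings):
    # Placeholder: in the toolbox this would be the TableToNumPyArray(...) call.
    sorted(range(100000), reverse=True)

for label, seconds in timings.items():
    print('{}: {:.4f}s'.format(label, seconds))
```

Running each variant (.dbf and scratch gdb) through the same wrapper should show whether the extra time goes into writing the table or into reading it back.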