My walk code will not skip reading the worksheets inside Excel files. Staff have Excel files with a huge number of worksheets, which slows my walk down so much that it is basically unusable. I think it is still reading the Excel worksheets in the else statement.
import arcpy, os, traceback, sys
arcpy.env.overwriteOutput = True
workspace = r"C:\Users\Documents\GisData"
arcpy.env.workspace = workspace
try:
    walk = arcpy.da.Walk(workspace)
    txt = open(r"C:\Users\Documents\StaffGISLibrary.txt", 'w')
    for dirpath, dirnames, filenames in walk:
        if arcpy.Exists(dirpath):
            #describe = arcpy.Describe(dirpath)
            if dirpath.endswith(('.xls', '.xlsx', '.txt')):
                print "skipping excel file"
                pass
            else:
                for filename in filenames:
                    fullpath = os.path.join(dirpath, filename)
                    describe = arcpy.Describe(fullpath)
                    print "writing " + fullpath
                    txt.write(fullpath + "," + filename + "," + describe.dataType + "\n")
        else:
            print "DOES NOT EXIST"
            pass
    del filename, dirpath, dirnames, filenames
    txt.close()
except Exception, e:
    pass
    # If an error occurred, print line number and error message
    import traceback, sys
    tb = sys.exc_info()[2]
    print "Line %i" % tb.tb_lineno
    print e.message
finally:
    raw_input("Finished!")
Just a hunch: shouldn't it be the filenames that you check for the .xls extensions, not the dirpath?
filenames goes into the worksheets themselves; I was trying to avoid that.
I would throw in some print statements then, or use os.path to exclude Excel files, although they should essentially work the same.
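A minimal sketch of the os.path idea, for anyone following along. The helper name has_skipped_extension and the SKIP_EXTS constant are made up for illustration; the extensions come from the check in the original script. It only looks at the extension of the path it is given, so it says nothing about whether arcpy has already opened the workbook by that point.

```python
import os

# Extensions to skip; taken from the endswith() check in the original script.
SKIP_EXTS = ('.xls', '.xlsx', '.txt')

def has_skipped_extension(path, skip_exts=SKIP_EXTS):
    """Return True if path ends in one of skip_exts (case-insensitive)."""
    # splitext returns (root, ext); ext includes the leading dot, or is ''.
    ext = os.path.splitext(path)[1].lower()
    return ext in skip_exts

if __name__ == "__main__":
    for name in ("Budget.XLSX", "roads.shp", "notes.txt"):
        print(name + " -> " + str(has_skipped_extension(name)))
```

Unlike a plain endswith() on the raw string, this lowercases the extension first, so Budget.XLSX is caught as well.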
I got to where I am by using print statements. The code works perfectly until I add in one of the large Excel files (large number of worksheets and records). It's not printing the Excel files, but it's still reading them in the else statement, because my cursor stalls. Hope that makes sense. It's as if dirpath or filenames is being pulled from the first part of the code, before I excluded the Excel files in the else statement.
dirpath: isn't that the directory path?
I would have put the Excel check in the filename section, or excluded it as a data type in the datatype section, or provided an inclusion list of the files you wish to examine.
I originally had it in the filename section, but then it had to go in and read all the worksheets. When I changed that, it worked much faster. I like the inclusion idea, but I'm not sure how to do that with file geodatabases and the feature classes within...
I suppose I would need to run a "list" this or that...
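One way the inclusion-list idea could look, as a rough sketch. The INCLUDE mapping, its labels, and the label_for helper are all hypothetical; swap in whatever extensions the library should actually report. A file geodatabase shows up on disk as a folder ending in .gdb, so it can be matched by extension like anything else.

```python
import os

# Hypothetical inclusion list: extension -> label to write to the report.
INCLUDE = {
    '.shp': 'ShapeFile',
    '.tif': 'Raster',
    '.gdb': 'FileGeodatabase',
}

def label_for(name, include=INCLUDE):
    """Return the label for an included extension, or None to skip the item."""
    ext = os.path.splitext(name)[1].lower()
    return include.get(ext)

if __name__ == "__main__":
    for name in ("roads.shp", "Parcels.GDB", "Budget.xlsx"):
        print(name + " -> " + str(label_for(name)))
```

Anything that comes back as None is skipped. This only identifies the geodatabase itself; the feature classes inside it are not visible as ordinary files, so listing them would still need arcpy, for example by running arcpy.da.Walk on just the geodatabases found, possibly with its datatype argument.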
Sorry for wasting everyone's time. It was just slow and I assumed it was stalling. I guess I need to be more patient.
Hey Amy,
Have you tried os.path.exists() instead of arcpy.Exists()? This should work way faster 🙂
I think I am going to move over to os.walk too; arcpy.da.Walk reports "does not exist" for too many files, over 50 percent.
Thanks for the tip about os.path.exists().
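For the os.walk route, a sketch along these lines might be a starting point. The function name list_gis_items is invented here, and treating each .gdb folder as a single item is an assumption about what the library listing should contain. It uses only the standard library, so nothing is opened by arcpy and Excel files are skipped purely by name.

```python
import os

# Extensions to leave out; taken from the check in the original script.
SKIP_EXTS = ('.xls', '.xlsx', '.txt')

def list_gis_items(root, skip_exts=SKIP_EXTS):
    """Walk root and return paths of files and .gdb folders, minus skip_exts."""
    skip_exts = tuple(skip_exts)  # str.endswith needs a tuple, not a list
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Record each file geodatabase as one item...
        for d in dirnames:
            if d.lower().endswith('.gdb'):
                found.append(os.path.join(dirpath, d))
        # ...and prune it in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if not d.lower().endswith('.gdb')]
        for f in filenames:
            if not f.lower().endswith(skip_exts):
                found.append(os.path.join(dirpath, f))
    return found

if __name__ == "__main__":
    for path in list_gis_items("."):
        print(path)
```

The slice assignment dirnames[:] matters: os.walk (in its default top-down mode) only honors pruning when the list is modified in place. To get a data type for each item, arcpy.Describe could still be called on the paths that survive the filter.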