My walk code will not skip reading the worksheets inside Excel files. Staff have Excel files with a huge number of worksheets, which slows my walk down so much that it is basically unusable. I think it is still reading the Excel worksheets in the else statement.
import arcpy, os, traceback, sys
arcpy.env.overwriteOutput = True
workspace = r"C:\Users\Documents\GisData"
arcpy.env.workspace = workspace
try:
    walk = arcpy.da.Walk(workspace)
    txt = open(r"C:\Users\Documents\StaffGISLibrary.txt", 'w')
    for dirpath, dirnames, filenames in walk:
        if arcpy.Exists(dirpath):
            #describe = arcpy.Describe(dirpath)
            if dirpath.endswith(('.xls', '.xlsx', '.txt')):
                print "skipping excel file"
                pass
            else:
                for filename in filenames:
                    fullpath = os.path.join(dirpath, filename)
                    describe = arcpy.Describe(fullpath)
                    print "writing " + fullpath
                    txt.write(fullpath + "," + filename + "," + describe.dataType + "\n")
        else:
            print "DOES NOT EXIST"
            pass
    del filename, dirpath, dirnames, filenames
    txt.close()
except Exception, e:
    pass
    # If an error occurred, print line number and error message
    import traceback, sys
    tb = sys.exc_info()[2]
    print "Line %i" % tb.tb_lineno
    print e.message
finally:
    raw_input("Finished!")
The simplest fix is changing line 13 from pass to continue. You are using the wrong control flow statement. See More Control Flow Tools.
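For reference, here is a minimal, arcpy-free sketch of how the two statements differ. The item names and the two helper functions are made up for illustration. `pass` is a no-op, so execution falls through to whatever follows in the loop body, while `continue` jumps straight to the next iteration.

```python
# Hypothetical item names, for illustration only.
paths = ["data.gdb", "budget.xlsx", "notes.txt", "roads.shp"]
skip_ext = (".xls", ".xlsx", ".txt")

def visit_with_pass(items):
    visited = []
    for item in items:
        if item.endswith(skip_ext):
            pass  # no-op: execution falls through to the append below
        visited.append(item)
    return visited

def visit_with_continue(items):
    visited = []
    for item in items:
        if item.endswith(skip_ext):
            continue  # skip the rest of the loop body for this item
        visited.append(item)
    return visited

print(visit_with_pass(paths))      # every item is visited
print(visit_with_continue(paths))  # only the .gdb and .shp items are visited
```

When the remaining work already sits in an `else` branch, as in the posted script, the two statements end up behaving the same way, because the `else` body is skipped in either case.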
I tried it with continue too, with the same result.
I was hoping the "simplest" fix would work. After seeing your reply and thinking about it, I realized why it doesn't work, or at least why your code is still slowing down when coming across Excel files.
As the ArcPy Data Access Walk documentation states, the standard method of forgoing a subworkspace is to modify the directory names list in place before the function starts stepping down into them.
When topdown is True, the dirnames list can be modified in-place, and Walk() will only recurse into the subworkspaces whose names remain in dirnames. This can be used to limit the search, impose a specific order of visiting, or even to inform Walk() about directories the caller creates or renames before it resumes Walk() again. Modifying dirnames when topdown is False is ineffective, because in bottom-up mode the workspaces in dirnames are generated before dirpath itself is generated.
Something along the lines of:
>>> walk = arcpy.da.Walk(workspace)
>>> for dirpath, dirnames, filenames in walk:
...     for dir in dirnames[:]:
...         if dir.endswith(('.xls', '.xlsx', '.txt')):
...             dirnames.remove(dir)
...     for filename in filenames:
....
Make sure to iterate over a copy of dirnames (as done by dirnames[:] ) or modifying the list in place won't work.
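The pruning pattern can be tried without ArcGIS by using `os.walk` as a stand-in, since it documents the same top-down behaviour for `dirnames`. This is only a sketch: it assumes `arcpy.da.Walk` honours in-place edits the way the quoted documentation describes, and the folder names, `walk_pruned`, and `prune_without_copy` are hypothetical. It also shows what goes wrong when the live list is iterated instead of a copy.

```python
import os
import shutil
import tempfile

SKIP_EXT = (".xls", ".xlsx", ".txt")

def walk_pruned(root):
    """Return the directories visited, never descending into SKIP_EXT names."""
    visited = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        visited.append(os.path.relpath(dirpath, root))
        # Iterate over a copy so that removing entries does not skip any.
        for d in dirnames[:]:
            if d.endswith(SKIP_EXT):
                dirnames.remove(d)  # in place: the walk will not enter d
    return sorted(visited)

def prune_without_copy(names):
    """Buggy variant: removing while iterating the live list skips entries."""
    for n in names:
        if n.endswith(SKIP_EXT):
            names.remove(n)
    return names

# Throwaway tree; directories named like Excel files stand in for workbooks.
root = tempfile.mkdtemp()
for name in ("a.xlsx", "b.xlsx", "keep.gdb"):
    os.makedirs(os.path.join(root, name, "child"))

result = walk_pruned(root)
leftover = prune_without_copy(["a.xlsx", "b.xlsx", "keep.gdb"])
shutil.rmtree(root)

print(result)    # only ".", "keep.gdb" and its child are visited
print(leftover)  # "b.xlsx" survives because the loop skipped over it
```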
Why would this print statement not work? (EOF error when using "\")
for dirpath, dirnames, filenames in walk:
    for d in dirnames:
        for f in filenames:
            print dirpath + d + "\" + f
del d, f, dirpath, dirnames, filenames
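Backslashes are escape characters in Python. In "\", the backslash escapes the closing quote, so the string literal is never terminated and the interpreter reaches the end of the line while still looking for the closing quote. Writing "\\" would fix the syntax error, but the safer approach when building file system paths is to use Python's os.path functionality, such as os.path.join, which also inserts the separators for you.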
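A short sketch of both options follows, with a made-up path for illustration. It uses `ntpath`, the Windows flavour of `os.path`, so the result is the same on any platform. In a script running on Windows, `os.path.join` gives the same behaviour.

```python
import ntpath  # Windows implementation of os.path, importable everywhere

backslash = "\\"  # an escaped backslash is a one-character string

# Manual concatenation: every separator must be added (and escaped) by hand.
manual = "C:\\GisData" + backslash + "roads" + backslash + "streets.shp"

# Joining: separators are inserted between the parts automatically.
joined = ntpath.join("C:\\GisData", "roads", "streets.shp")

print(len(backslash))
print(manual)
print(joined)
```

Both approaches produce the same string here, but the join version cannot leave a separator out, which the `dirpath + d` concatenation in the question does.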