loop down through a directory tree.

8268
19
07-02-2014 09:01 AM
PaulHuffman
Occasional Contributor III
I want to loop through a folder and clip all feature classes by a AOI poly and write the results to an output folder.  How to I also have the loop search all subfolders for vectors and raster feature classes as well?
Tags (2)
0 Kudos
19 Replies
BruceNielsen
Occasional Contributor III

You could use a list comprehension on line 8 to create a full path to each of the rasters:

rasterList.append([os.path.join(arcpy.env.workspace, A) for A in arcpy.ListRasters("*", "GRID")])

PaulHuffman
Occasional Contributor III

That produces an error "TypeError: 'NoneType' object is not iterable"   What does that mean?  Is the "*" the 'NoneType'?

0 Kudos
PaulHuffman
Occasional Contributor III

I'm about ready to tell the client that we're going to run this in ArcGIS 10.2.  da.Walk makes this so easy.  This script uses da.walk and handles the tifs and grids in the tree, except I hit one grid that had a file name so long I couldn't add any extra letters.

#testcliprasterda.py

#test trying to walk a directory tree, find rasters and shapefiles, and clip to a area of Interest (AOI)

# using da.walk first available at ArcGIS 10.1

# Paul Huffman, Yakama Fisheries, 7/7/2014

# Import arcpy module

import arcpy

import os

#arcpy.CheckOutExtension("spatial")

# Set Geoprocessing environments

arcpy.env.scratchWorkspace = "c:\\avdata\\PythonTest\\ClipLidar\\Scratch"

#arcpy.env.workspace = "c:\\avdata\\PythonTest\\ClipLidar\TestData"

arcpy.env.workspace = "c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit"

arcpy.env.overwriteOutput = True

clipshape = "c:\\avdata\\PythonTest\\ClipLidar\\AOIPoly_SP.shp"

InputFolder = "c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input"

#InputFolder = "c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit"

#OutputFolder = "c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Output"

#Find AOI extent

AOI = arcpy.mapping.Layer(clipshape)

AOIextent=AOI.getExtent()

#rasterList=[]

#Find the input feature classes

#for root, dirs, files in os.walk(InputFolder):

#walk = arcpy.da.Walk(InputFolder, topdown=True, datatype="RasterDataset")

for root, dirs, files in arcpy.da.Walk(InputFolder, topdown=True, datatype="RasterDataset"):

    for name in files:

        FileObj = arcpy.mapping.Layer(os.path.join(root, name))

        FileExtent = FileObj.getExtent()

        print "Processing: " + name

       #if image overlap or touches AOI polygon,  then clip image, write to output

        Extent_Disjoint = str(FileExtent.disjoint(AOIextent))

        if Extent_Disjoint == 'False':

            print "Clipping: " + name

            OutputPath = root.replace("Input","Output")

            if not os.path.exists(OutputPath):

                os.makedirs(OutputPath)

            #Handle tifs

            if os.path.splitext(name)[1] == ".tif":   

                Clipname = os.path.splitext(name)[0] + "cp.tif"

            else:

                #Clipname = os.path.splitext(name)[0] + "c"

                Clipname = os.path.splitext(name)[0]

            print "Writing " + (os.path.join(OutputPath,Clipname))

            arcpy.Clip_management((os.path.join(root, name)),"#",(os.path.join(OutputPath,Clipname)),clipshape,"#","ClippingGeometry")

0 Kudos
BruceNielsen
Occasional Contributor III

What that probably means is that the list was empty; no GRIDs in the folder. You may need to make it a little smarter:

rasterInDir = arcpy.ListRasters("*", "GRID")

if len(rasterInDir) > 0:

    rasterList.append([os.path.join(arcpy.env.workspace, A) for A in rasterInDir])

PaulHuffman
Occasional Contributor III

I'm starting to get confused.  Using Bruce's last code, now I'm getting subdirectories to my InputFolder added to my rasterList.  And I'm missing the last slash in the path name on all of them.  Grids are only found in C:\avdata\PythonTest\ClipLidar\TestData\Input\Highest_Hit

>>> ================================ RESTART ================================

>>>

c:\avdata\PythonTest\ClipLidar\TestData\InputHighest_Hit

c:\avdata\PythonTest\ClipLidar\TestData\InputOrthophotos

c:\avdata\PythonTest\ClipLidar\TestData\Inputvectors

c:\avdata\PythonTest\ClipLidar\TestData\Input\Highest_Hitellen_n_hh

c:\avdata\PythonTest\ClipLidar\TestData\Input\Highest_Hithillsha_elle1

c:\avdata\PythonTest\ClipLidar\TestData\Input\Highest_Hitinfo

c:\avdata\PythonTest\ClipLidar\TestData\Input\Highest_Hitmt_clifty_hh

c:\avdata\PythonTest\ClipLidar\TestData\Input\Highest_Hitraven_hh

c:\avdata\PythonTest\ClipLidar\TestData\Input\Highest_Hittaneum_hh

c:\avdata\PythonTest\ClipLidar\TestData\Input\Highest_Hitteanaway_hh

c:\avdata\PythonTest\ClipLidar\TestData\Input\Highest_Hity5012010-2000

>>>

0 Kudos
PaulHuffman
Occasional Contributor III

Less confused now.  What I was seeing was the output of the first print statement, unnoticed at first, now commented out, but it showed me that I needed to add '\\" into the string.  Now I'm getting a rastList that I can work with:

InputFolder = "c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input"

rasterList=[]

#Find the input feature classes

for root, dirs, files in os.walk(InputFolder):

    for targetdir in dirs:

        #print root + "\\" + targetdir

        targetworkspace = root + "\\" + targetdir

        arcpy.env.workspace = targetworkspace

        #rasterList.append (arcpy.ListRasters("*", "GRID"))

        rasterInDir = arcpy.ListRasters("*", "GRID")

        if rasterInDir > 0:

            rasterList.append([os.path.join(arcpy.env.workspace, A) for A in arcpy.ListRasters("*", "GRID")])

for raster in rasterList:

        print raster, "\n"

0 Kudos
PaulHuffman
Occasional Contributor III

I'm still confused.  It looks like my full path rasters are in the list twice, along with some [] elements.

>>> for raster in rasterList:

    print raster

   

[u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\ellen_n_hh', u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\hillsha_elle1', u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\mt_clifty_hh', u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\raven_hh', u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\taneum_hh', u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\teanaway_hh', u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\y5012010-2000']

[]

[]

[u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\ellen_n_hh\\ellen_n_hh']

[u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\hillsha_elle1\\hillsha_elle1']

[]

[u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\mt_clifty_hh\\mt_clifty_hh']

[u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\raven_hh\\Band_1']

[u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\taneum_hh\\taneum_hh']

[u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\teanaway_hh\\teanaway_hh']

[u'c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit\\y5012010-2000\\y5012010-2000']

0 Kudos
PaulHuffman
Occasional Contributor III

I got this version that is working for Grids.  But it will bomb if it finds a second folder under the input starting point containing grids.  I'm detecting  grids with by looping through workspaces and adding grids to the list if the number of grids counted is > 0,  lines 28 to 34.  But the folder above the Grids is always included in this list,  so I delete it out again with line 38.  Very crude!

#testclipgrids.py

#test trying to walk a directory tree, find grids, and clip to a area of Interest (AOI)

# and make it run in 10.0 so without da.walk.

# Paul Huffman, Yakama Fisheries, 7/14/2014

# Import arcpy module

import arcpy

import os

#arcpy.CheckOutExtension("spatial")

# Set Geoprocessing environments

arcpy.env.scratchWorkspace = "c:\\avdata\\PythonTest\\ClipLidar\\Scratch"

#arcpy.env.workspace = "c:\\avdata\\PythonTest\\ClipLidar\TestData"

#arcpy.env.workspace = "c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input\\Highest_Hit"

arcpy.env.overwriteOutput = True

clipshape = "c:\\avdata\\PythonTest\\ClipLidar\\AOIPoly_SP.shp"

InputFolder = "c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input"

#Find AOI extent

AOI = arcpy.mapping.Layer(clipshape)

AOIextent=AOI.getExtent()

#Intialize empty lists

rasterList=[]

rasterList1=[]

#Find the input feature classes

for root, dirs, files in os.walk(InputFolder):

    for targetdir in dirs:

        targetworkspace = root + "\\" + targetdir

        arcpy.env.workspace = targetworkspace

        rasterInDir = arcpy.ListRasters("*", "GRID")

        if len(rasterInDir) > 0:

            rasterList.append(arcpy.env.workspace)

for raster in rasterList:

        print raster

del rasterList[0]

#start comparing extents and clipping

for name in rasterList:

    FileObj = arcpy.mapping.Layer(name)

    FileExtent = FileObj.getExtent()

    print "Processing: " + name

   #if image overlap or touches AOI polygon,  then clip image, write to output

    Extent_Disjoint = str(FileExtent.disjoint(AOIextent))

    if Extent_Disjoint == 'False':

        print "Clipping: " + name

        foldername = os.path.dirname(name)

        OutputPath = foldername.replace("Input","Output")

        #print "name = " +name

        #print "OutputPath =" + OutputPath

        if not os.path.exists(OutputPath):

            os.makedirs(OutputPath)

        Clipname = os.path.splitext(name)[0]

        Clipbasename = os.path.basename(Clipname)   

        #print "Clipname = " + Clipname

        print "Writing " + (os.path.join(OutputPath,Clipbasename))

        arcpy.Clip_management(name,"#",(os.path.join(OutputPath,Clipbasename)),clipshape,"#","ClippingGeometry")

0 Kudos
PaulHuffman
Occasional Contributor III

There is still something wrong with the way arcpy.ListRasters works to detect grids.  Maybe not wrong, but unexpected.  Line 12's output shows that ListRasters still finds my grids in their parent workspace and also the subdirectory for the grid that contains the info external files.

The code segment:

#Find the input  Grid feature classes

for root, dirs, files in os.walk(InputFolder):

    for targetdir in dirs:

        targetworkspace = root + "\\" + targetdir

        arcpy.env.workspace = targetworkspace

        rasterInDir = arcpy.ListRasters("*", "GRID")

        if len(rasterInDir) > 0:

            for raster in rasterInDir:

                rasterList.append(targetworkspace + "\\" + raster)

           

for raster in rasterList:

        print raster

The output of the print statement, in part:

c:\avdata\PythonTest\ClipLidar\TestData\Input\BareEarth\hillsha_tane1

..

..

c:\avdata\PythonTest\ClipLidar\TestData\Input\BareEarth\hillsha_tane1\hillsha_tane1

0 Kudos
PaulHuffman
Occasional Contributor III

Fixed it.  This is what satisfied the colleague, and was put into production. Lines 30-34 is how I check to see if the current worksheet is a Grid or a workspace containing grids.  Had to go through this extra BS because I couldn't use da.walk.

#testclipgridstifsshapes.py

#test trying to walk a directory tree, find grids, and clip to a area of Interest (AOI)

# and make it run in 10.0 so without da.walk.

# Paul Huffman, Yakama Fisheries, 7/21/2014

# Import arcpy module

import arcpy

import os

# Set Geoprocessing environments

#arcpy.env.scratchWorkspace = "c:\\avdata\\PythonTest\\ClipLidar\\Scratch"

arcpy.env.overwriteOutput = True

clipshape = "c:\\avdata\\PythonTest\\ClipLidar\\AOIPoly_SP.shp"

InputFolder = "c:\\avdata\\PythonTest\\ClipLidar\\TestData\\Input"

#find input folder bottom folder

InputString = InputFolder.split("\\")[-1]

#Find AOI extent

AOI = arcpy.mapping.Layer(clipshape)

AOIextent=AOI.getExtent()

#Intialize empty lists

rasterList=[]

#Find the input  Grid feature classes

for root, dirs, files in os.walk(InputFolder):

    for targetdir in dirs:

        targetworkspace = root + "\\" + targetdir

        #if the targetworkspace can be initialized as a Raster object, do nothing else

        try:

            arcpy.Raster(targetworkspace)

         #if the targetworkspace fails when an attempt is made to initialize it as a raster object, search for rasters within it

        except:

            arcpy.env.workspace = targetworkspace

            rasterInDir = arcpy.ListRasters("*", "GRID")

            if len(rasterInDir) > 0:

                for raster in rasterInDir:

                    rasterList.append(targetworkspace + "\\" + raster)

           

##for raster in rasterList:

##        print raster

#start comparing extents and clipping

for name in rasterList:

    FileObj = arcpy.mapping.Layer(name)

    FileExtent = FileObj.getExtent()

    #print "Processing: " + name

   #if image overlap or touches AOI polygon,  then clip image, write to output

    Extent_Disjoint = str(FileExtent.disjoint(AOIextent))

    if Extent_Disjoint == 'False':

        print "Clipping: " + name

        foldername = os.path.dirname(name)

        OutputPath = foldername.replace(InputString,"Output")

        if not os.path.exists(OutputPath):

            os.makedirs(OutputPath)

        if len(os.path.basename(name)) > 10:   

            Shortname = (os.path.basename(name))[0:11]

            print "Grid name shortened to " + Shortname

            Clipname = os.path.splitext(Shortname)[0] + "cp"

        else:   

            Clipname = os.path.splitext(name)[0] + "cp"

        Clipbasename = os.path.basename(Clipname)

        print "Writing " + (os.path.join(OutputPath,Clipbasename))

        arcpy.Clip_management(name,"#",(os.path.join(OutputPath,Clipbasename)),clipshape,"#","ClippingGeometry")

#Find the input tifs and shapefiles

for root, dirs, files in os.walk(InputFolder):

    for name in files:

        if os.path.splitext(name)[1] == ".tif":

            FileObj = arcpy.mapping.Layer(os.path.join(root, name))

            FileExtent = FileObj.getExtent()

            #print "Processing: " + name

           #if image overlap or touches AOI polygon,  then clip image, write to output

            Extent_Disjoint = str(FileExtent.disjoint(AOIextent))

            if Extent_Disjoint == 'False':

                print "Clipping: " + name

                OutputPath = root.replace(InputString,"Output")

                if not os.path.exists(OutputPath):

                    os.makedirs(OutputPath)

                Clipname = os.path.splitext(name)[0] + "cp.tif"

                print "Writing " + (os.path.join(OutputPath,Clipname))

                arcpy.Clip_management((os.path.join(root, name)),"#",(os.path.join(OutputPath,Clipname)),clipshape,"#","ClippingGeometry")

        #for shapefiles, use a different clip, write to output.

        if os.path.splitext(name)[1] == ".shp":

            print(os.path.join(root, name))

            FileObj = arcpy.mapping.Layer(os.path.join(root, name))

            FileExtent = FileObj.getExtent()

            Extent_Disjoint = str(FileExtent.disjoint(AOIextent))

            if Extent_Disjoint == 'False':

                print "Clipping: " + name

                OutputPath = root.replace(InputString,"Output")

                if not os.path.exists(OutputPath):

                    os.makedirs(OutputPath)

                Clipname = os.path.splitext(name)[0] + "cp"

                print "Writing " + (os.path.join(OutputPath,Clipname))

                arcpy.Clip_analysis((os.path.join(root, name)),clipshape,(os.path.join(OutputPath,Clipname)))

   

0 Kudos