Print files in folder as ArcCatalog sees them

01-12-2023 01:51 PM
AlfredBaldenweck
MVP Regular Contributor

Forgive me if I've asked this before.

Does anyone have a script that returns all the files in a folder as Catalog sees them?

I can use arcpy.ListFiles(), but I don't want to see all the components that make up a shapefile; I just want to see the shapefile itself.

Similarly, I could use arcpy.ListFeatureClasses() or ...Tables, etc., but that would ignore non-GIS files.

For example, how could I cleanly return the contents of this folder, more or less as I see it here?

[Screenshot: the folder contents as shown in Catalog]

Compare to file explorer:

[Screenshot: the same folder in File Explorer]

Thanks!

5 Replies
DavidSolari
Frequent Contributor

You can try something like this:

 

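import arcpy
from os import path

def getFiles():
    # Assumes arcpy.env.workspace already points at the folder to list.
    def fname(x):
        return path.splitext(path.basename(x))[0]
    # The List* functions may return None if the workspace is not set, hence "or []".
    gisFiles = (arcpy.ListFeatureClasses() or []) + (arcpy.ListTables() or []) + (arcpy.ListRasters() or [])  # And so on.
    gisBaseNames = {fname(g) for g in gisFiles}
    # Keep any other file whose base name does not match a GIS dataset.
    otherFiles = [f for f in (arcpy.ListFiles() or []) if fname(f) not in gisBaseNames]
    return sorted(gisFiles + otherFiles)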

 

The catch is that if you have, say, "mydata.shp" and "mydata.xlsx" in the same folder, you'll lose the xlsx file, because both share the base name "mydata". The fix is to build up a list of known special GIS extensions and split those files out, but that's a bit harder to write up, so this should at least be a starting point.
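For what it's worth, here is a rough sketch of that "known extensions" idea using only os.path, so it runs without arcpy. The SIDECAR_EXTS set and the collapse_shapefiles name are my own assumptions for illustration, and the extension list is certainly not complete; it only hides a sidecar file when a .shp with the same base name is in the listing.

```python
import os

# Assumption: a hand-picked, incomplete set of shapefile sidecar extensions.
SIDECAR_EXTS = {".shx", ".dbf", ".prj", ".sbn", ".sbx", ".cpg"}

def collapse_shapefiles(names):
    """Drop sidecar files that belong to a .shp present in the same listing."""
    shp_bases = {
        os.path.splitext(n)[0].lower()
        for n in names
        if n.lower().endswith(".shp")
    }
    kept = []
    for n in names:
        base, ext = os.path.splitext(n)
        base, ext = base.lower(), ext.lower()
        # 'roads.shp.xml' splits into ('roads.shp', '.xml'), so check it separately.
        if ext == ".xml" and base.endswith(".shp") and os.path.splitext(base)[0] in shp_bases:
            continue
        # Hide a sidecar only when its shapefile is actually in the folder.
        if ext in SIDECAR_EXTS and base in shp_bases:
            continue
        kept.append(n)
    return sorted(kept)

listing = ["mydata.shp", "mydata.shx", "mydata.dbf", "mydata.prj",
           "mydata.shp.xml", "mydata.xlsx", "notes.dbf"]
print(collapse_shapefiles(listing))
```

Because the match is on base name plus extension rather than base name alone, "mydata.xlsx" survives, and so does a .dbf that has no matching shapefile.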

by Anonymous User
Not applicable

Just add the types you want to see to the files list. This also keeps files that share a name but have different extensions.

 

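import os
# Extensions to keep (lowercase); extend the list with whatever else you want to see.
files = ['.shp', '.txt', '.xlsx', '.csv', ...]
filteredFiles = [f for f in os.listdir(r'your path') if os.path.splitext(f)[1].lower() in files]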

 

If you want geodatabases as well, you can loop with for root, dirs, files in os.walk(r'your path'): and pick out the directories whose names contain .gdb.
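A minimal sketch of that os.walk idea, assuming you only want the paths of the .gdb folders and do not want to descend into them. The find_geodatabases name is just something I made up for the example.

```python
import os

def find_geodatabases(top):
    """Return the paths of folders under `top` whose names end in .gdb."""
    found = []
    for root, dirs, _files in os.walk(top):
        gdbs = [d for d in dirs if d.lower().endswith(".gdb")]
        found.extend(os.path.join(root, d) for d in gdbs)
        # Prune in place so os.walk does not descend into the geodatabases.
        dirs[:] = [d for d in dirs if d not in gdbs]
    return sorted(found)
```

Assigning to dirs[:] (rather than rebinding dirs) is what tells os.walk to skip those folders, which keeps the internal geodatabase files out of the results.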

AlfredBaldenweck
MVP Regular Contributor

I'm going for a variation on this method, but I'm running into a weird issue. It's ignoring some of the shapefile parts when I'm telling it to remove them.

See here:

 

for r in rootL:
    arcpy.env.workspace = r
    ignoreExt = ['.shp', '.shx', '.dbf', '.prj', '.xml', '.sbn', '.sbx', '.cpg', '.aux']

    TList= ['test.shp', 'test.shx', 'test.dbf', 'test.prj', 'test.xml', 'test.sbn', 'test.sbx', 'test.cpg', 'test.aux' ]
    
    for f in TList:
        #print(os.path.splitext(f)[1])
        if os.path.splitext(f)[1] in ignoreExt:
            #print(f)
            TList.remove(f)
            
    print(TList)
    # ['test.shx', 'test.prj', 'test.sbn', 'test.cpg']

 

 Why are these files still in the list?

 

 

Edit: Apparently splitext() is ignoring those files. What's weirder is that it doesn't ignore them if you feed them to it directly; os.path.splitext('test.shx')[1] correctly yields ".shx".

JeffreyHolycross
Esri Contributor

I think the issue may be that you're modifying TList while iterating over it. If you print(f) in every iteration, you'll see that not all of the original items are printed. You should create a new list and append to it instead of modifying the existing list.
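As a rough sketch of the new-list approach, reusing the ignoreExt values from the post above: the keep_unignored helper and the extra 'notes.txt' entry are my own additions for illustration.

```python
import os

ignoreExt = ['.shp', '.shx', '.dbf', '.prj', '.xml', '.sbn', '.sbx', '.cpg', '.aux']

def keep_unignored(names, ignore_ext):
    """Build a new list rather than removing from the one being looped over."""
    kept = []
    for f in names:
        if os.path.splitext(f)[1] not in ignore_ext:
            kept.append(f)
    return kept

TList = ['test.shp', 'test.shx', 'test.dbf', 'test.prj', 'test.xml',
         'test.sbn', 'test.sbx', 'test.cpg', 'test.aux', 'notes.txt']
print(keep_unignored(TList, ignoreExt))
```

The original TList is never modified, so every item is visited exactly once.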

by Anonymous User
Not applicable

You shouldn't remove items from a list that you are iterating over, unless you are going in reverse. It is skipping the .shx, .prj, .sbn, and .cpg entries because for f in TList simply visits each position in order: TList[0], then TList[1], then TList[2], and so on until it runs out of items.

Removing TList[0] shifts all the other items one slot to the left; the original TList[1] is now TList[0], and since the loop moves on to position 1 next, that item is skipped.

You can iterate in reverse with reversed():
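Here is a small demonstration of the skipping, assuming a shortened four-item version of the list. The remove_while_iterating name is made up for the example; it records which items the loop actually visits.

```python
def remove_while_iterating(names):
    """Remove every visited item in place and report which items were visited."""
    visited = []
    for f in names:
        visited.append(f)
        names.remove(f)  # shifts the remaining items one slot to the left
    return visited

names = ['test.shp', 'test.shx', 'test.dbf', 'test.prj']
visited = remove_while_iterating(names)
print(visited)  # ['test.shp', 'test.dbf']
print(names)    # ['test.shx', 'test.prj']
```

Only every other item is visited, which matches the leftover .shx, .prj, .sbn, and .cpg entries in the output posted above.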

 

for f in reversed(TList):
    # print(os.path.splitext(f)[1])
    if os.path.splitext(f)[1] in ignoreExt:
        # print(f)
        TList.remove(f)