Tool or script to find all Shapefiles with bad naming conventions in an APRX?

06-25-2025 11:59 AM
JoeBryant1
Frequent Contributor

As my organization transitions to using Pro for all of our reports, we are bringing along a lot of legacy shapefiles and shapefile-based workflows. Many of our modelers and scientists use software or Python scripts that can only create shapefiles, so we will continue to reference these shapefiles in our ArcGIS Pro projects when creating our reports. Python in particular lets users create unsupported file and field names with no check to see if they are valid.

Pro will open and display most shapefiles even when their file names are too long, or when their file or field names contain unsupported characters. Only when you reach a particular query, symbology, editing, or geoprocessing operation do you run into issues, and the error message is often generic and does not tell you that bad shapefile naming is to blame. For instance, the Project Package tool that zips up an APRX along with all of the files referenced in the maps ("Share Outside Organization") will fail with "General Function Failure" when your project references shapefiles with bad naming, even though the project was functioning normally.

Is there currently a way, possibly using Python, to scrape all Data Sources in an APRX and to see if any are shapefiles with unsupported file or field names?

At the very least, I'd like this tool to produce a list of the offending files. Ideally it would also tell the user what specifically is unsupported. The logical next tool would take these files and import them into the project's default geodatabase, hopefully correcting any unsupported naming in the process.

2 Replies
BrennanSmith1
Frequent Contributor

Yeah, this can be done with arcpy. The general flow would be something like this:

import os
import arcpy

aprx = arcpy.mp.ArcGISProject('CURRENT')
# for all maps in the project
for _map in aprx.listMaps(): 
    # for all layers in the map   
    for _layer in _map.listLayers():
        # only layers that expose a data source (skips group layers, etc.)
        if _layer.supports("DATASOURCE"):
            # if the layer is a shapefile (case-insensitive extension match)
            if _layer.dataSource.lower().endswith('.shp'):
                shpPath = _layer.dataSource
                # validate against the containing folder, which is the shapefile's workspace
                shpFolder = os.path.dirname(shpPath)
                shpName = os.path.splitext(os.path.basename(shpPath))[0]
                # check if the shapefile name is valid
                shpValid = arcpy.ValidateTableName(shpName, shpFolder)
                if shpName != shpValid:
                    # print out any issues
                    print('Shapefile Name "{}" invalid, use "{}" instead'.format(shpName, shpValid))
                # for each field in the shapefile
                for _field in arcpy.ListFields(shpPath):
                    # check if field name is valid
                    fldName = _field.name
                    fldValid = arcpy.ValidateFieldName(fldName, shpFolder)
                    if fldName != fldValid:
                        # print out any issues
                        print('Shapefile "{}" Field Name "{}" invalid, use "{}" instead'.format(shpName, fldName, fldValid))
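
The validate functions above only hand back a corrected name, not the reason the original was rejected. If you also want to report what specifically is wrong, here is a rough pure-Python sketch that could sit alongside them. The rules are my assumptions, not something arcpy reports: names should start with a letter, contain only letters, digits, and underscores, and field names are capped at 10 characters (the dBASE limit). The helper names name_problems and field_problems are made up for illustration, and the file name length limit is left as a parameter because I don't know what limit applies in your case.

```python
import re

# Assumed field-name limit for shapefiles (dBASE .dbf format).
MAX_FIELD_LEN = 10


def name_problems(name, max_len=None):
    """Return a list of reasons why `name` looks unsafe (empty list if none).

    The rules are assumptions for illustration, not arcpy output.
    """
    if not name:
        return ["name is empty"]
    problems = []
    # Rule 1 (assumed): the first character should be an ASCII letter.
    if not re.match(r"[A-Za-z]", name):
        problems.append("does not start with a letter")
    # Rule 2 (assumed): only ASCII letters, digits, and underscores.
    bad = sorted(set(re.findall(r"[^A-Za-z0-9_]", name)))
    if bad:
        problems.append(
            "contains unsupported characters: " + " ".join(repr(c) for c in bad)
        )
    # Rule 3: optional length cap, only checked when the caller supplies one.
    if max_len is not None and len(name) > max_len:
        problems.append("is {} characters long (limit {})".format(len(name), max_len))
    return problems


def field_problems(field_name):
    """Apply the same checks with the assumed 10-character field limit."""
    return name_problems(field_name, max_len=MAX_FIELD_LEN)
```

You could call name_problems(shpName) and field_problems(fldName) inside the loops and print the returned reasons next to the suggested replacement name.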

 

ZacharyUhlmann1
Frequent Contributor

Hey Joe,

For a start, use the arcpy.mp module. Here is the gist of what I do; you can expand and tailor it as needed:

import arcpy
import numpy as np
import pandas as pd

aprx = arcpy.mp.ArcGISProject('CURRENT')
lyt_name, ds_list, lyr_name, map_element, map_name = [], [], [], [], []
for lyt in aprx.listLayouts():
    # map frames are the layout elements that display a map
    for em in lyt.listElements('MAPFRAME_ELEMENT'):
        for lyr in em.map.listLayers():
            # only inventory layers that are turned on
            if lyr.visible:
                lyr_name.append(lyr.name)
                lyt_name.append(lyt.name)
                map_element.append(em.name)
                map_name.append(em.map.name)
                # some layers (e.g. group layers) have no data source
                try:
                    ds_list.append(lyr.dataSource)
                except AttributeError:
                    ds_list.append('NA')

# one row per visible layer, per map frame, per layout
df = pd.DataFrame(np.column_stack([lyt_name, map_element, map_name, lyr_name, ds_list]),
                  columns=['layout', 'map_element', 'map_name', 'layer', 'source'])
df.to_csv(r'path\to\lyR_inventory.csv')

This saves a CSV at the path passed to df.to_csv (path\to\lyR_inventory.csv in the snippet above).

I have a Python Toolbox built already that you are free to use:  pro_project_utils (Python Toolbox) 

  • The first tool inventories all layouts in a project (aprx) and the maps used in each.
  • The second tool is the one you want: it inventories all layers in all maps contained in a project (aprx).

Screenshot 2025-06-30 140638.png

I believe I can document the arguments so that they display in the toolbox, letting you click on each one to see what it expects and what the tool outputs. If you intend to use it, let me know and I will update it on GitHub. In general, here are the arguments for Layer Inventory for Map Elements (the second tool):

  • Check "Current Pro Document" if you are running the tool from within the project you want to inventory.
  • Otherwise, provide path/to/aprx in the Other Pro-Document argument.
  • path/to/...formatted table is a DIRECTORY where the inventories (.csv) will be saved. Note that if Unique Layer Inventory is checked, there will be two inventories (see last bullet).
  • File Name can be <filename>.csv or simply <filename>; either way the output will be a .csv.
  • Unique Layer Inventory: I'm fairly certain this produces one row per feature class/.shp/.tif/file that is visible (i.e. checked to display) in the Table of Contents, for every map used in a layout in the specified project. Note that in this case TWO .csv files are exported: one with unique layers and one with all layers listed for each map.

Screenshot 2025-06-30 140647.png

Below is what the lyR_inventory.csv will look like.  This is a curated subset.

  • layout = name of the layout
  • map_element = the name of the map element (map frame) in the layout
  • map_name = map name
  • layer = layer name from the Table of Contents
  • source = this is what you want (!!!!!): the actual Data Source. Here is where you can further view and identify problematic shapefiles, feature classes, etc.
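
To go from the inventory to just the shapefiles, something like the following sketch could work. It assumes the CSV has the 'source' column described above and uses only the standard library csv module rather than pandas. The function name shapefile_sources is made up for illustration.

```python
import csv


def shapefile_sources(csv_file):
    """Return the unique .shp paths from the 'source' column, in first-seen order.

    `csv_file` is an open text file (or any iterable of CSV lines) that has a
    'source' header, as in the inventory written by df.to_csv.
    """
    seen = []
    for row in csv.DictReader(csv_file):
        source = (row.get("source") or "").strip()
        # Case-insensitive match so .SHP is also caught.
        if source.lower().endswith(".shp") and source not in seen:
            seen.append(source)
    return seen


# Example usage (the path is the elided placeholder from the snippet above):
# with open(r"path\to\lyR_inventory.csv", newline="") as f:
#     for shp in shapefile_sources(f):
#         print(shp)
```

The resulting list can then be fed to whatever name and field checks you settle on, so each shapefile is checked once even if it appears in several layouts.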

Screenshot 2025-06-30 142607.png

 

 

layout | map_element | map_name | layer | source
fall_cr_site_visit_march2023_private_prop | Map Frame | fall_cr_visit_march2023 | shasta_trail | C:\Documents\20250627_shasta_trail\shasta_trail.shp
fall_cr_site_visit_march2023_private_prop | Map Frame | fall_cr_visit_march2023 | access_routes_v3 | C:\other_folder\mapping.gdb\access_routes\access_routes_v3
keno_parcels | Map Frame | keno_parcels | parcels_master_rectified_formatted | C:\other_folder\mapping.gdb\base_layers\parcels_master_rectified_formatted
keno_parcels | Map Frame | keno_parcels | World Imagery | https://services.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer
shasta_trail_work | Map Frame 1 | Map | shasta_trail | C:\Documents\20250627_shasta_trail\shasta_trail.shp
shasta_trail_work | Map Frame | shasta_trail_work_2025 | City of Yreka | C:\other_folder\\mapping.gdb\legend\legend_item_poly
shasta_trail_work | Map Frame | shasta_trail_work_2025 | lkp_master_labels | C:\other_folder\\master.gdb\labels\lkp_master_labels
shasta_trail_work | Map Frame | shasta_trail_work_2025 | World Imagery | https://services.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer
jcb_parcels_blm | Map Frame | keno_parcels | parcels_master_rectified_formatted | C:\other_folder\\mapping.gdb\base_layers\parcels_master_rectified_formatted
jcb_parcels_blm | Map Frame | keno_parcels | parcels_master_rectified_formatted | C:\other_folder\\mapping.gdb\base_layers\parcels_master_rectified_formatted
jcb_parcels_blm | Map Frame | keno_parcels | World Imagery | https://services.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer
fall_cr_site_visit_march2023 | Map Frame | fall_cr_visit_march2023 | shasta_trail | C:\Documents\20250627_shasta_trail\shasta_trail.shp
fall_cr_site_visit_march2023 | Map Frame | fall_cr_visit_march2023 | access_routes_v3 | C:\other_folder\\mapping.gdb\access_routes\access_routes_v3
fall_cr_site_visit_march2023 | Map Frame | fall_cr_visit_march2023 | Lake_Surfaces | C:\other_folder\\master.gdb\hydrology\Lake_Surfaces

 
