Python script to loop through folders and replace data sources running slow

01-09-2020 09:56 AM
New Contributor II


I have a Python script that loops through all folders and subfolders in a specified directory, searching for MXDs that contain a specific data source. If the data source is found, the script replaces it with a new one.

The script works, but it is very slow. I ran it on a test folder with 3 subfolders and a total of 9 maps, and it took a little over 30 minutes. I am far from an expert in Python scripting. Can anyone see something I am doing wrong in my script that might cause it to run slowly, or is there a better way to achieve what I'm trying to do?

Thank you.

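import arcpy, os, fnmatch

def find(pattern, path=os.getcwd()):
    # Walk path recursively and yield the full path of every file matching pattern.
    for root, dirs, files in os.walk(path):
        for filename in files:
            if fnmatch.fnmatch(filename, pattern):
                yield os.path.abspath(os.path.join(root, filename))

for mxdname in find('*.mxd', r'D:\GIS Projects\TEST'):
    print mxdname
    # ListLayers needs a MapDocument object, not a path string.
    mxd = arcpy.mapping.MapDocument(mxdname)
    changed = False
    for lyr in arcpy.mapping.ListLayers(mxd):
        if lyr.supports("DATASOURCE"):
            if lyr.dataSource == r"C:\GIS\Cty\2017 Orthos\2017_Orthos.gdb\Orthos_2017":
                lyr.replaceDataSource(r"D:\GIS\Cty\Orthos\County_Orthos.gdb", "FILEGDB_WORKSPACE", "County_Orthos")
                lyr.name = "County_Orthos"
                changed = True
                print 'Replaced Orthos'
    # Save only the map documents that were actually modified.
    if changed:
        mxd.save()
    del mxd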
4 Replies
MVP Regular Contributor

The slowness is not coming from your logic but rather the time it takes for ArcPy to "open" each MXD and look at all the data sources. If you have any missing data sources, that will cause the MXD to open more slowly. Also, if there are just a lot of layers from many different sources, that will also cause it to open slowly.
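
One way to check this on your own maps is to time each step separately. The following is a minimal sketch using only the standard library. `StepTimer` and the step names are hypothetical helpers invented for illustration, not part of ArcPy; the ArcPy calls in the commented usage are the ones from the script above.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StepTimer(object):
    """Accumulates wall-clock seconds per named step (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def step(self, name):
        # Time whatever runs inside the "with" block and add it to the total.
        start = time.time()
        try:
            yield
        finally:
            self.totals[name] += time.time() - start

    def report(self):
        # Return (step name, total seconds) pairs, slowest step first.
        return sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)


# Usage inside the loop from the original post (requires ArcMap's arcpy):
#
#   timer = StepTimer()
#   with timer.step("open"):
#       mxd = arcpy.mapping.MapDocument(mxdname)
#   with timer.step("list layers"):
#       layers = arcpy.mapping.ListLayers(mxd)
#   with timer.step("replace"):
#       ...  # the replaceDataSource loop
#   for name, seconds in timer.report():
#       print("%s: %.1f s" % (name, seconds))
```

If "open" dominates the report, the time is going into loading the map document rather than into the loop logic.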

MVP Esteemed Contributor

Blake Terhune, you are absolutely correct. For more years than I can recall, I have been asking Esri for an option to open an MXD without "touching" every data source. In the past, when I have had a large number of MXDs to re-source, I have disconnected the machine from every network before running the code. If the machine isn't connected to any network, every attempt by ArcMap to connect to a data source fails instantly, so the MXD opens very quickly.

New Contributor II

Thanks, Joshua. I will also point out that before this script I was using one that searched each MXD for the specific layer name, removed the layer, and then added the new layer, which had been saved out as a layer file. That process was significantly faster (1-2 minutes per map file vs. 3-5 for replaceDataSource), but the AddLayer function only let me add the layer at the top, the bottom, or the default position in the table of contents.

This was a problem because the layer was part of a group layer in some maps, and I didn't want to take it out of the group layer. I also experimented with searching for the group layer, removing it, and adding a new group layer that I had saved out as a layer file. However, the map files varied too much, and I didn't want to work out and script every possible case.
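
For the group-layer problem, `arcpy.mapping.InsertLayer` positions a new layer relative to an existing reference layer, including one inside a group layer, so the replacement can take the old layer's slot before the old layer is removed. The following is only a sketch under assumptions: `swap_layer_in_place` and its `mapping` parameter are hypothetical (the parameter exists only so the function can be exercised without ArcMap), and I have not measured whether this keeps the speed advantage of the remove-and-add approach, since the MXD still has to be opened.

```python
def swap_layer_in_place(mxd, old_source, lyr_file, mapping=None):
    """Insert the layer stored in lyr_file directly before every layer whose
    dataSource equals old_source, then remove the old layer.

    Returns the number of layers swapped. Hypothetical helper, not an Esri API.
    """
    if mapping is None:
        import arcpy  # only available with an ArcMap installation
        mapping = arcpy.mapping

    new_layer = mapping.Layer(lyr_file)
    swapped = 0
    for df in mapping.ListDataFrames(mxd):
        # ListLayers returns a plain list, so editing the map while looping is safe.
        for lyr in mapping.ListLayers(mxd, "", df):
            if lyr.supports("DATASOURCE") and lyr.dataSource == old_source:
                # The reference layer fixes the position, group layer included.
                mapping.InsertLayer(df, lyr, new_layer, "BEFORE")
                mapping.RemoveLayer(df, lyr)
                swapped += 1
    return swapped
```

With ArcMap available, this would be called with a `MapDocument` object, followed by `mxd.save()` when the returned count is greater than zero. The symbology of the inserted layer comes from the layer file, not from the layer it replaces.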

New Contributor II

Thanks, Blake. Most of the maps being updated have hundreds of layers, and I had hundreds of maps to update as well. I was just curious whether I had done something wrong in the script that could cause the issue.
