Python addin for data inventory and “broken-link” repair.

08-31-2015 03:20 PM
RebeccaStrauch__GISP
MVP Esteemed Contributor

Updated

2/7/2017, 1/30/2017 - added a link to a thread with info on finding/replacing graphic elements.  At some point I may add this functionality to the tool, but I have no time for that now.  It is worth noting as a possible solution:

   Python script for mass find and replace of workspace path? 

https://community.esri.com/message/664513-re-write-broken-source-list-to-text-file    and a simple script, if you do not want to use the addin

12/20/2016  -- my download and install instructions were a bit off.  Paul Davidson pointed out that the install was not working.  If the file you download is called ChkandFixLinks.esriaddin.zip, you need to unzip it first, then double-click the ChkandFixLinks.esriaddin file.  Then it should work.  This is a toolbar for ArcCatalog (not ArcMap).  I just installed and tested it with 10.5 (I only tested the first 2 buttons to make sure it worked).

10/28/2015 New tool "Set Map Data Sources" provided by ArcGISTeam LocalGov   Have not tried, but worth a look

9/22/2015  4:45 pm (AK time)   --- removed extra quote in line 52 of the "fix" script that caused error. Updated attachment

CheckAndFixLinks.esriaddin

- Download the esriaddin file.  If it downloads as a .zip file, unzip it so you see the .esriaddin file.

- For Python addin, double click ChkandFixLinks.esriaddin to install  to ArcCatalog.

- If you prefer to see/modify the scripts, rename the .esriaddin to .zip, then unzip.  The toolbar and scripts are available for viewing/editing.

At this time, this is the only download source.  I may move it to ArcScript 2.0 (when available) and/or GitHub at some point.

ArcCatalog ToolBox – tested with 10.2.2/10.3.0  -- will create reports of all broken links, with option to repair all types of connections

Note: there is a known issue when accessing mxds saved as 10.3.x from the addin installed on a 10.2.2 machine.

This Toolbar can perform the following on a folder/subfolders (using walk):

  1. list your file geodatabases (FGDB), and approximate size on disk;
  2. inventory your feature classes, all types including: FGDB, covers, grids, etc.;
  3. list broken links (based on machine/user running the tool) in MXDs;
  4. repair individual feature class broken links….including .sde and .ags connections, etc. (based on .csv file input)
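For readers who only want the reporting side of item 3, the general idea can be sketched as follows. This is not the addin's code; it assumes ArcGIS Desktop 10.x with `arcpy.mapping` available, and the helper names `find_mxds` and `list_broken_sources` are made up for illustration.

```python
import os


def find_mxds(folder):
    """Walk folder and its subfolders, returning sorted full paths of .mxd files."""
    found = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.lower().endswith(".mxd"):
                found.append(os.path.join(root, name))
    return sorted(found)


def list_broken_sources(mxd_path):
    """Return (name, data source) pairs for broken layers/tables in one mxd."""
    import arcpy  # imported here so find_mxds works without ArcGIS installed

    mxd = arcpy.mapping.MapDocument(mxd_path)
    broken = []
    for item in arcpy.mapping.ListBrokenDataSources(mxd):
        # Layers expose supports(); table views do not, so guard the call.
        has_source = not hasattr(item, "supports") or item.supports("DATASOURCE")
        broken.append((item.name, item.dataSource if has_source else ""))
    del mxd  # release the reference to the map document
    return broken
```

As with the addin, what counts as "broken" here depends on the machine and user running the code.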

The first three (left of the Yield sign icon) create reports only (.txt, .csv, .xls)…they do not modify your mxds in any way, so these may be nice tools to use even if you fix your links in another manner.

The Yield sign is to remind you that the input .csv file is required for input for the next tool (after the Yield). 

Note: For those that want a bulk drive-letter and/or <servername> change only, I removed this tool for this first release, but the code is there (same as 3a, and tool is shown in Toolbox in zip). I provided this, but suggest skipping the bulk change and using the other repair tool instead. (reason: bulk change uses findAndReplaceWorkspacePaths at the mxd level, and may change path for layers that need to be handled differently)

It is recommended that you create a copy of your mxds and do a few test runs to get familiar with the tools (because these do a “walk”, do NOT place your backup copies in a subfolder of your working folder). I have an option to write the updated mxds to a new _repair folder, but I have this disabled at this time.  See the recommended workflow at the end of the document.
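Because the tools walk every subfolder, a quick check that the backup location is not nested under the working folder can save some grief. The following is a small sketch using only the standard library; `is_inside` is a hypothetical helper, not part of the addin.

```python
import os


def is_inside(path, folder):
    """Return True if path is folder itself or lies anywhere below it."""
    path = os.path.normcase(os.path.abspath(path))
    folder = os.path.normcase(os.path.abspath(folder))
    try:
        rel = os.path.relpath(path, folder)
    except ValueError:
        return False  # different drives on Windows, so not nested
    # A relative path that climbs upward means path is outside folder.
    return rel != os.pardir and not rel.startswith(os.pardir + os.sep)
```

If `is_inside(backup_folder, working_folder)` returns True, the walk would also pick up (and the repair tool would also modify) the backup copies.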

Notes and Cautions:

  • If you see this error: “TypeError: GPToolDialog() takes at most 1 argument (2 given)”, it can be ignored.  This is still an arcpy bug (NIM089253).
  • The first three scripts only create lists with .csv, .xls, and/or .txt output files, so other than possibly taking a long time to run (if complex structure or slow system), they will not change any data. Most tools output a comma delimited (.csv), text (.txt), and/or an Excel (.xls) file.   By default, the YYYYMMDD_HHMM is appended to the name (startup timestamp for HHMM) so it is not overwritten on repeated runs, and the file(s) will be written to the folder being analyzed (you will need write-access to working drive to write the files).  You may need to eventually clean up (delete) these output files if you run the script often.
  • The Yield sign is a reminder that for the repair script, you need to modify a copy of the Excel/.csv file above and save it as a .csv file for input to the next script. When in doubt about a “new source”, do not enter a NewPath and leave NewType = “_review”.  Those old sources will not change.
  • The repair script (to right of the Yield sign) will change the mxd’s (if new source path in the .csv), so it is highly recommended that you backup your folders and mxd’s before running either one. Also, I suggest testing a few mxds in a safe location until you are comfortable with what each tool does  
  • ** Keep in mind that broken links are in “the eye of the beholder”, that is, broken based on the machine and user running the script, so when replacing paths, if you can use shared connection files, and or common path names, that will keep the mxd’s and data un-broken for multiple users.
  • If the mxd is storing “broken” user’s login credentials, running these programs may cause an unsuccessful login attempt and therefore lock the user (depending on your network setup). Keep this in mind so you can unlock the user as appropriate.
    • Caution
  • Some manual editing steps are required to create the input list for the repair broken links tool.
  • If you have multiple users using the same mxd, consider using common mapping and or connection file. Store the connection files in a common location that is mapped the same for all.
  • The tools will also echo messages to the Results tab so you can track progress….but this can also create a very large result output. It is recommended that you “remove” the result output once the script is complete and you are done reviewing it in the Results tab. (Leaving these large result output listings can significantly slow the opening and closing of ArcCatalog.)
  • If Catalog closes before the script is complete, the output file will not be written.
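The timestamped output names mentioned above can be produced along these lines. This is only a sketch: the underscore between the base name and the timestamp is an assumption, and `stamped_name` is a hypothetical helper rather than the addin's own function.

```python
from datetime import datetime


def stamped_name(basename, extension, when=None):
    """Append a YYYYMMDD_HHMM stamp so repeated runs do not overwrite output."""
    when = when or datetime.now()
    return "{0}_{1}{2}".format(basename, when.strftime("%Y%m%d_%H%M"), extension)
```

Taking the timestamp once at startup and reusing it for the .csv, .xls, and .txt files keeps the outputs of a single run grouped under the same stamp.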

Tools (note: all tools use “walk” to include folder and subfolders):

  • 1-List FGDB size on disk - to .csv .xls files (ListFGBsize.py)
    • Input arguments:
      • theWorkspace: drive or folder to walk through
      • outfile: default is GDBLIST, will append YYMMDD_HHMM and extensions
    • Field names in output: Name, GDBpath, and ApproxMB.

Default output name is GDBlist with a date-time (YYYYMMDD_HHMM) appended to the basename to keep output unique for repeated script execution.
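The approach behind tool 1 can be approximated with the standard library alone, since a file geodatabase is a folder whose name ends in .gdb. The sketch below is not ListFGBsize.py; the function names are hypothetical, and it returns rows matching the Name, GDBpath, and ApproxMB fields rather than writing files.

```python
import os


def folder_size_bytes(path):
    """Sum the sizes of all files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file vanished or is locked; skip it
    return total


def list_fgdb_sizes(workspace):
    """Return (Name, GDBpath, ApproxMB) tuples for every .gdb folder found."""
    rows = []
    for root, dirs, _files in os.walk(workspace):
        for d in list(dirs):
            if d.lower().endswith(".gdb"):
                full = os.path.join(root, d)
                approx_mb = round(folder_size_bytes(full) / (1024.0 * 1024.0), 2)
                rows.append((d, full, approx_mb))
                dirs.remove(d)  # do not descend into the geodatabase itself
    return rows
```

The rows could then be written out with the `csv` module; the size is approximate because it is simply the sum of the file sizes on disk.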

  • 2-Inventory FC reports, .csv AND text output   (InventoryFC.py)
    • Originally had two separate scripts for the outputs, but combined them since 99% was exactly the same (and could write to two files in one pass)
    • For a given folder, identifies and creates a list of all feature classes, including FGDB, covers, and grids
    • Outputs two files
      • .csv (comma delimited) format
        • FType – a class name assigned by me, for example:
          • ArcInfoTable
          • CoverageFeatureClass
          • FeatureClass
          • (add raster sample)
        • FCname
          • Table or FC name if in a GDB
          • Arc (line), point, label, polygon if a coverage
          • (add raster sample)
        • FullPath
          • For file geodatabase, thru .gdb
          • For cover tables, thru “covers” folder
          • For covers, thru coverage name
      • .txt (very basic, report format – easier to visualize)
        • A couple header lines,
          • “List of all GIS data in <folder>  on <MM/DD/YYYY>
            Includes coverages (pts, poly, arc, anno), shapes, and FGDB data.
            -----------------------------------------------------“
        • followed by list of FGDB/workspace/folder; my featureclass tag, as shown above, and the feature class files within them (indented for easy reading)
    • Neither of these files is currently a unique list, and some folders (especially coverages) are repeated…may change this to be unique at some point, but not high priority
  • 3a-Create Unique list of Broken Links (3b has the option to fix drive letters first)
    • Creates a csv, Excel (xls), and optionally an FGDB table (I have not found a use for the FGDB table yet) of unique broken links within all mxds in the folder/subfolders.
      • Option removed for this release…3B OPTION: to repair drive letter changes before running. CAUTION: using this option will use findAndReplaceWorkspacePaths at the mxd level and may not be what you need….make sure you have a backup first.
    • Output formats
      • .csv (comma delimited) format (default)
      • .xls (Excel)
      • .txt report
      • Option: FGDB table
    • Output fields:
      • UniqID – auto incremented number, just to make it easier
      • dataType – a tag I assign to help identify source type, e.g. Fgdb, MapServer_connection, Table_other, etc.
      • newType – “_review”, text to remind you it needs to be reviewed for possible correction
      • brokenPath – self explanatory
      • newPath – self explanatory
  • 4-Repair broken link source (4_DataSourceRepairX.py)
    • Updates source paths of broken links, based on input .csv file
    • Input:
      • Folder to process (will also walk through subfolders)
      • .csv file with newType and newPath updated
    • Outputs: CAUTION overwrites the mxd, so make sure you have a copy in a location that is NOT in or below the folder that you will be processing (script has the ability to use SaveAs, but it is not currently activated in the tool)
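To illustrate how the input .csv drives the repair, here is a sketch of selecting only the rows that are ready to process. It is not the logic of 4_DataSourceRepairX.py; it assumes the column headers are spelled exactly like the output fields listed for tool 3a, and `rows_to_repair` and `INFO_ONLY` are hypothetical names.

```python
import csv

# Types described in this post as information only (no repair is attempted).
INFO_ONLY = {"group", "events_table", "esri.sdc", "_unknown"}


def rows_to_repair(lines):
    """Return rows that have a newPath and a newType other than _review.

    lines is any iterable of text lines, e.g. an open .csv file.
    """
    selected = []
    for row in csv.DictReader(lines):
        new_type = (row.get("newType") or "").strip().lower()
        new_path = (row.get("newPath") or "").strip()
        if not new_path or new_type in ("", "_review") or new_type in INFO_ONLY:
            continue  # leave this source untouched
        selected.append(row)
    return selected
```

Rows left at “_review” or without a newPath are skipped, which mirrors the advice to leave uncertain sources alone until they have been reviewed.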

Suggested workflow:

  1. Run scripts #1 and #2 to get a feel for the data in your folder
  2. If you haven’t already, create backup/copy of the folder you will be working with.
  3. Run #3a (broken list, “without updates”) to find all broken links. 
  4. Review the output .xls and/or .csv.  I suggest making a copy of whichever is easier for you to work in (suggested new name: RepairBrokenLinks).  They have the same info, just in two different formats.
    • Caution: if/when sorting, make sure you have data selected.
    • I suggest you initially sort by datatype and remove all the “Group” and “Event” rows…those are for info only.
    • You may then want to sort by BrokenPath so you can find any pattern that may need to change to the same new source. 
      • For example, John had a source mapped as “d:\” , while Jane had it mapped as “f:\”  --- both show as broken links and you now want it to be mapped as a UNC path.  Add the path to newPath. If same data type, dupe value in dataType to newType … if changed, change newType as appropriate.
    • For changing SDE, I found creating new connections and saving them to a common location worked best.
    • For changing ArcGIS Services, I create a layer (.lyr) file and save it to a common location. These actually require that the current connection be dropped and the new layer be added (no replace workspace will work).
    • These are current data types the program can handle
      • cover_arc
      • cover_pont
      • cover_poly
      • cover_region
      • cover_tic
      • shape
      • fgdb
      • pgdb
      • sde
      • dbf
      • table_other
      • table_dat
      • txt
      • raster – may be SDE raster layers, or older NGS-TOPO! .tpq rasters
      • raster.jpg
      • raster.bmp
      • raster.gif
      • raster.jpg
      • raster.sid
      • raster.tif
      • service_<your AGS service name>
      • esri.sdc - this is for information only, no repair included in script….these records should be deleted before running fix
      • other – these may be coverages that could not be classified
      • group – this is for information only….these records should be deleted before running fix
      • events_table – this is for information only….these records should be deleted before running fix
      • _unknown  – this is for information only and not sure what these are. For me, listed .mxd name and may be those with .SDE issues….these records should be deleted before running fix
      • newType - “_review” until modified by user
      • brokenPath – broken data source path found in mxd
      • newPath – empty until modified by user
  5. For any broken links that need further review (i.e. not ready to change to a newPath), leave the newType as “_review” and the newPath blank.
  6. Once ready, run #4 to repair the broken links.
  7. Once the repair is complete, run #3a again to get a new list of broken links remaining.
  8. Repeat #3a and #4 as needed.
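For step 4, when many broken paths differ only by a drive letter (the John “d:\” and Jane “f:\” example), the newPath values can be drafted in bulk before reviewing them by hand. This is a sketch only; `propose_new_path` is a hypothetical helper and any server or share name used with it is a placeholder.

```python
def propose_new_path(broken_path, prefix_map):
    """Swap a known old prefix (e.g. a drive letter) for its new location.

    prefix_map maps old prefixes to new ones; matching ignores case.
    Returns "" when nothing matches so the row stays flagged for review.
    """
    lowered = broken_path.lower()
    # Try longer prefixes first so the most specific mapping wins.
    for old in sorted(prefix_map, key=len, reverse=True):
        if lowered.startswith(old.lower()):
            return prefix_map[old] + broken_path[len(old):]
    return ""
```

For example, mapping both `d:\` and `f:\` to the same UNC prefix gives every user's variant of a path the same proposed newPath; rows that come back empty keep “_review” and are handled manually.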
29 Comments
MichaelMiller2
Regular Contributor

Excellent Tool Rebecca!!

RebeccaStrauch__GISP
MVP Esteemed Contributor

Thanks Michael.....didn't even know if anyone had downloaded and tried it yet.  Nice to know it's helpful.  A bit complicated to get the last step setup with all the new path, but get it setup once and you can use the same .csv repeatedly if needed.  It's working well for us.

CatherineEscarpeta
New Contributor

great tool - thank you for sharing with us

lanadonaldson1
New Contributor III

Well now i'm excited to get this going...however i'm getting the error "No GUI components found in this Add-In." repeatedly when trying to load the add-in. I researched quickly there doesn't seem to be much helpful info....any ideas?

RebeccaStrauch__GISP
MVP Esteemed Contributor

did you remove the .zip extension to .addin and then double-click?  I guess I should also ask what version of ArcGIS Desktop you are using (should work for 10.2.x/10.3.x) and runs in ArcCatalog (not arcmap)

User35489
Occasional Contributor II

Hi Rebecca,

Thanks for excellent tool. Do you know any addin which helps in fixing/updating user login credentials(updating new/changed password) for mxds?

Thank you

-AS

RebeccaStrauch__GISP
MVP Esteemed Contributor

Abdullah,

Glad you are finding it useful.

I have not seen, nor have I tried anything for updating credentials.  One option, I would guess, is to create a new connection and then replace the "source". ??    For ArcSDE, we use domain accounts now and do not store user/pass and that seems to be ok.  For ArcGIS Server connections in an mxd, same thing.

However, it might be worth asking that as a new question.  I would suggest the Managing Data space as a start.  I'm assuming you are talking about user/pass credentials for SDE?  So maybe also in Geodatabase.

User35489
Occasional Contributor II

Greetings Rebecca,

Any suggestion for GROUP LAYER data source fixing?

Thanks

-AS

RebeccaStrauch__GISP
MVP Esteemed Contributor

I have a version that I was updating to include more options, but it's stuck on hold right now.  But looking at my code, it looks like I am skipping over lyr.isGroupLayer and "Events" because, if I remember correctly, there isn't a path for "Grouped" items....and events are really just a temp pointer.

So, sorry, no suggestions on GROUP LAYER.

PaulDavidson1
Regular Contributor

Was starting to work on just such a tool and thought, this must have been done by someone!

Great! And thanks for sharing the source.   Nice code BTW

How about working with 10.4.1 or even (now) 10.5?

I'm going to try it on 10.4.1 and will report back.

Have you had a chance to work with the toolbar provided by LocalGov?

Any comments on how that compares?

Thanks again for this! 

This really belongs on GitHub (I think that has preceded ArcScript hasn't it?)

RebeccaStrauch__GISP
MVP Esteemed Contributor

Hi Paul, glad you are finding it useful.  I haven't tried it on 10.5 or 10.4 yet, but I think it should work.  We skipped over 10.4 but will be testing out 10.5 final this next week, so I'll test then.

I have not tried the LocalGov toolbar, but if you have a link, I'll give it a try.

The reason I didn't put it on GitHub was, even though I feel comfortable pulling from GitHub, I still don't feel comfortable posting.  I'm more likely to put it on ArcScripts 2.0, ArcGIS Code Sharing, but I put it up as a blog since I wasn't sure if it was generic or polished enough for all....but it is a good starting point.

if I ever have time, I'd like to expand it with some other similar "utility" type scripts/tools that others have come up with.

PaulDavidson1
Regular Contributor

Hi Rebecca:

The LocalGov toolbar I referred to was the item in your 10/28/15 update: New tool "Set Map Data Sources"

I've looked at that and it seems useful but not as in depth as your tool.

I'm trying to get your toolbar to work in 10.4.1 but so far no luck.

I changed the config.xml to version 10.4, looks like that's the only change to the source that is needed?

I rebuild the addin, and it acts like it is adding the toolbar to my system but then it doesn't show up in my toolbars (via Customize)

I've seen this issue before with other addins and have had to try to add them in as many different ways as I can think of (double clicking the addin file, going to customize, etc...)

So far, no luck though.

Any ideas?  Thanks

RebeccaStrauch__GISP
MVP Esteemed Contributor

You are in ArcCatalog and not ArcMap, correct?  I'll test it in 10.5.0 and see if it loads.  If all else fails, you can bypass the toolbar and find the install/expanded folder in your user AppData folder (I'll have to find the correct path)....or even easier, change the .addin extension back to .zip, then just unzip it to a location you know, then in Catalog you should be able to find the toolbox.  I think all the tools should work that way. After all, when creating an addin, first you get everything to work with the toolbox.  The addin just adds the easy packaging and installation and the fancy button GUI.  But

re: the LocalGov 10/28/15 update....shows you how much I remember about what I wrote over a year ago. lol.

PaulDavidson1
Regular Contributor

Ahh, slap upside the head... No, I've been in ArcCatalog.

I've been bitten this way more than once, you'd think I'd remember.

There it is....   I just went through this with the newer Xray.  Well, not just, been at least 6 months.

2 weeks is my memory limit (probably more like 2 hours these days.) 

Thanks!

I'll give it a test in 10.4.1 and see how it goes.  I haven't installed 10.5 yet but will be soon.

Too many new Portal things to not jump to 10.5 

I'm working on the Image Server first.  Any day now...

EDIT: seems fine in 10.4.1, thanks!

RebeccaStrauch__GISP
MVP Esteemed Contributor

I think I was unclear....I wanted to make sure you were in ArcCatalog, not ArcMap.  It works on full folders/drives etc, not one...so that shouldn't be the issue.

Understood about jumping into 10.5. Luckily I have a test server and EDN that is letting me play. The "new" Portal concept is all new (really the same as previous version, just a few other options with user levels. so might work better for my dept, but that's another discussion).

Again, try the unzip method if the addin doesn't work.  Then if you run into an error we can look at that.  But the files and toolbox and tools should all be in that file.

...edit....but I guess I should download and try myself. 

RebeccaStrauch__GISP
MVP Esteemed Contributor

I updated my install instructions above, if you haven't noticed already.  Seems like my download must have gone back and forth between the ChkandFixLinks.esriaddin being IN a .zip file, and me just renaming the .addin to zip.  Anyway, hopefully the instructions cover both now.  Thanks for pointing the issue out.  

The tool might be over kill for many users, and may not cover everything 100%, but does cover most situations of broken links.  There are a few data types that I haven't figured out or finished, but it is a good tool especially if you have a lot of users, paths and mxds over the years.

Let me know if you still have issues getting it to install on 10.4.

PaulDavidson1
Regular Contributor

Ahh, slap upside the head... No, I've been in ArcMap.

I've been bitten this way more than once, you'd think I'd remember.

But no, there it is....   I just went through this with the newer Xray.  Well, not just, been at least 6 months.

2 weeks is my memory limit (probably more like 2 hours these days.) 

Thanks!

Seems fine in 10.4.1, thanks!  I did modify the config.xml to show 10.4 but I bet that wasn't necessary.

We have hundreds of old mxds floating around.  Some of our Analysts have been here ~25 years.

We came out of the City and they had an Esri user number in the top 10 (well... they still do) so the history of GIS here goes way back.  But we had to get a new Esri cust #.  I lobbied for 7B but...

I'm moving us toward Portal, Water Solutions, etc... and some data structure changes.

So being able to seamlessly move folks' mxds to the new layouts will help maintain goodwill.

Otherwise, we get the normal IT reputation of forcing changes on people, sorry, yes you do have to create that all over!

This tool will help a lot.  I've written simpler scripts that will deal with simple cases and walk folders but I like having the toolbar and training the end user so as they find that folder of 50 old maps that they just have to have, they can do it themselves.  Or at least we can do it for them in a consistent manner.

Many thanks and happy holidays.

RebeccaStrauch__GISP
MVP Esteemed Contributor

We have hundreds of old mxds floating around.  Some of our Analysts have been here ~25 years.

We came out of the City and they had an Esri user number in the top 10 (well... they still do) so the history of GIS here goes way back.  But we had to get a new Esri cust #. 

I've been working with GIS and Esri products for my department for 30+ years and have the same (still retained) single-digit CN.  I'm going to hate to give it up when I retire, but alas, it will stay with the department. (sigh)  Hopefully it doesn't get changed for them.   My personal one is 6 digits and it's just not as fun to say.   lol

Seems fine in 10.4.1, thanks!  I did modify the config.xml to show 10.4 but I bet that wasn't necessary.

I agree, really isn't necessary. even for 10.5.

I'm glad you got it working.  The way I approached the process was to run the broken link tool on an entire drive.  Sometimes you need to do this from a few different users'/machines' point of view.  If you can get a list of all the potential broken links, you can sort them and see similar datasets reached by a variety of paths.  I have been trying to change all of mine to the FQDN path to the data (I think that is the correct initialism) so it will be the same for all users. You can have many variations in the broken column that point to the same new destination.  It will also help you find data that users might have on their local machine instead of a common network location....if that is needed.

I consider myself a hack when it comes to programming, even though I have written many complex scripts.  So there are always things that can be improved.  So, even though I don't have this on GitHub, if you find something that you tweak to make it work better, let me know.  I know there are things that I want to expand on....just not a high priority in the scheme of things right now.

Happy Holidays to you too.

MichaelVolz
Esteemed Contributor

Nice tool!!

pokateo_
New Contributor III

Awesome tool! Thank you Rebecca!

I couldn't get the tool to work properly without changing the cell text in the .csv from "_review" to "match". This wasn't clear to me in your instructions. Can it be changed to anything but "_review"? Just deleting "_review" didn't work for me.

Also, a couple times I got this error. Why are matches typically not found?

Thanks for putting this together!

EDIT: I am working in ArcGIS Desktop 10.5

RebeccaStrauch__GISP
MVP Esteemed Contributor

I'm glad you are finding the tool valuable.

I couldn't get the tool to work properly without changing the cell text in the .csv from "_review" to "match". This wasn't clear to me in your instructions. Can it be changed to anything but "_review"? Just deleting "_review" didn't work for me.

You need to change the "_review" in the newType column to the same value as the dataType column....or to a new valid type (as I have listed).  

If the column has _review it will skip it...this is a placeholder for you to know that it needs to be looked at further...it may need to be processed manually (i.e. not 100% yet).  But it will look at the value in the column to see what the new dataType will be...whether the same type or a different one.  This is in case you need to change, for example, from a shapefile to a fgdb featureclass.  Keep in mind that if changing the type you will want to make sure the fields and/or symbology or GP/selections are still compatible.

Does that help?

edit:  take a look at the attached sample excel file which might help explain it better....keep in mind I have <> and xxx for IP addresses ....proper paths would be needed of course.

btw, I have tested with 10.2.x thru 10.5.x   And seems to work fine (with the exceptions noted on installing the addin)

pokateo_
New Contributor III

That makes complete sense! Thank you! 

No excel file was attached...

RebeccaStrauch__GISP
MVP Esteemed Contributor

It's attached to the original post (sorry, should have made that more clear).

HurmainAriffin1
New Contributor

Hi Rebecca,

This is very useful tool for me as I need to do inventory for all GIS data (subfolders).

My concern here is that the tool cannot list all the raster types, e.g. .tif, .jpg, .img, etc., except the ESRI GRID.

Please correct me if I am wrong and I really appreciate if we have opportunity to get the updated version.

Many thanks...

RebeccaStrauch__GISP
MVP Esteemed Contributor

Hi Hurmain Ariffin

I'm glad you find it useful.  However, I don't think I ever finished with all the raster types, since I was mainly interested in seeing what was inside my FGDBs and coverages.  However, it may be that the broken link tool lists those.  I haven't worked on this for a while but hope to get back to see if I can add features (but probably not until later this year, i.e. Fall/Winter 2018). I will post if/when I get a new share-able version.  Feel free to unzip the addin and modify the .py files.

I hadn't run the version installed on my machine for a while so I am doing that now.  If I find that it does those raster files, I'll send you a copy (I received your direct email).  Our IT wiped out all my current development a few months ago and I'm still trying to rebuild the priority scripts from my deployed addins.  Looking on the bright side, if there is one, at least I had those.   Lesson #1 for me....back everything up to a safe location that they won't delete.

deleted-user-Kh-VU5qzFIGJ
New Contributor II

Hi Rebecca,

Do you know where exactly this tool downloads to?  I need to uninstall it and am having a difficult time doing so.  Thanks!

EllenSaxon
New Contributor III

How have I only just found this?! I have been wanting this for years!!! 

Thank you!

RebeccaStrauch__GISP
MVP Esteemed Contributor

Glad to know it still works and is still helpful.  

About the Author
Worked with GIS for 30+ years for the Alaska Dept of Fish and Game.