johnmdye

Extract Attachments to Disk

Blog Post created by johnmdye on Oct 6, 2016

Today I needed to save all of the attachments for a feature class to a folder on my system, so I thought I'd share the solution I put together. Thanks to Andrew Chapkowski for the real meat of this solution.

from arcpy import da
import datetime
import os

# Specify the directory you want to save your attachments to
SaveFolder = r"C:\Users\johnmdye\Documents\ArcGIS\MyGDB_Attachments"
# Reference the Feature Class
FC = r"C:\Users\johnmdye\Documents\ArcGIS\MyGDB.gdb\Points_of_Interest"
# Reference the Attachment Table for that Feature Class
AttachmentTable = r"C:\Users\johnmdye\Documents\ArcGIS\MyGDB.gdb\PointsOfInterest_ATTACH"

# Setup a counter
a_count = 0

# Set up a SearchCursor on your Attachment Table to iterate on the
# 'rel_globalid', 'data' and 'att_name' attributes of your attachments.
# These respectively contain the globalid of the feature related to that
# attachment, the data of that attachment in bytes, and the name of the file
# as it was saved on the system which uploaded the attachment.
with da.SearchCursor(AttachmentTable, ["rel_globalid", "data", "att_name"]) as Acursor:
    # For each attachment
    for a in Acursor:
        # get the 'rel_globalid' as 'a_id'
        a_id = a[0]
        # get the 'data' as 'bytestream'
        bytestream = a[1]
        # get the 'att_name' as 'fname'
        fname = a[2]
        # Parse the 'fname' value to determine the file extension of the attachment.
        # This is important because the attachment's bytes are encoded for one
        # specific file format, so if you try to write them to a file with the wrong
        # extension (such as writing a .jpg attachment as a .png or .pdf file),
        # programs that rely on the extension will try to read the file as the
        # wrong format and fail to open it.
        # os.path.splitext splits on the last dot only, so names containing
        # several dots (or none at all) are handled.
        ext = os.path.splitext(fname)[1]
        # Set up a SearchCursor on the Feature Class which the attachments are related
        # to and iterate on the 'globalid' and 'last_edit' fields. Use of the 'last_edit'
        # field will only work if you had Editor Tracking enabled on the dataset.
        # If your data has a unique name field for all records, you could add that to
        # the search cursor and, with some slight modification, write files that have
        # that feature's unique name as the filename. I'm using the 'last_edit' field
        # because it's a date-time stamp that, in the vast majority of cases, would be
        # unique for each feature.
        # NOTE: the last-edit date field is named 'revised' in the cursor below;
        # swap in the name of your own field.
        with da.SearchCursor(FC, ['globalid', 'revised']) as Fcursor:
            # for each feature
            for f in Fcursor:
                # get the 'globalid' as 'f_id'
                f_id = f[0]
                # get the 'last_edit' date as 'f_dt'
                f_dt = f[1]

                # if the current attachment's 'rel_globalid' equals the feature's
                # 'globalid'
                if a_id == f_id:
                    # Yay! We found an attachment. Increment the counter by 1
                    a_count += 1
                    # arcpy.da cursors return date fields as python datetime
                    # objects, so the 'last_edit' value can be formatted directly
                    dtObject = f_dt
                    # convert my datetime object to a string representing that date
                    dtString = dtObject.strftime('%B.%d.%Y-%I.%M.%S')
                    try:
                        # Write my attachment to disk; the 'with' block closes the
                        # file even if the write fails
                        out_name = os.path.splitext(fname)[0] + "_" + dtString + ext
                        with open(os.path.join(SaveFolder, out_name), 'wb') as out_file:
                            out_file.write(bytestream.tobytes())
                    # If something bad happened while trying to write the attachment
                    # to disk
                    except Exception:
                        # Tell me why I'm a bad person
                        raise
print("All Done. Extracted " + str(a_count) + " attachments.")
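
The filename logic in the script (attachment name, plus the feature's edit date, plus the original extension) can be tried out without arcpy. Below is a minimal sketch of that step on its own. The helper name `build_attachment_filename` and the sample values are my own assumptions for illustration, not part of the original script, and it assumes the edit date is already a Python `datetime` object.

```python
import datetime
import os


def build_attachment_filename(att_name, edit_date):
    """Return '<base>_<timestamp><ext>' for an attachment name and edit date.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # splitext splits on the last dot only: 'site.photo.jpeg' -> ('site.photo', '.jpeg')
    base, ext = os.path.splitext(att_name)
    # Same timestamp layout the script uses (month name, day, year, 12-hour time)
    stamp = edit_date.strftime('%B.%d.%Y-%I.%M.%S')
    return base + "_" + stamp + ext


# Made-up sample values, just to show the shape of the result
sample_date = datetime.datetime(2016, 10, 6, 14, 30, 15)
print(build_attachment_filename("site.photo.jpeg", sample_date))
```

One thing to keep in mind: `%I` is a 12-hour clock and the format has no AM/PM marker, so two edits exactly twelve hours apart would produce the same stamp. Using `%H` instead would avoid that if it matters for your data.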

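The script opens a new SearchCursor on the feature class for every attachment, which can get slow on larger datasets. One possible alternative is to read the features once into a dictionary keyed on 'globalid' and then look each attachment up in it. This is only a sketch under assumptions: `match_attachments` is a hypothetical helper name, and plain tuples stand in for the cursor rows so it runs without arcpy. Since `da.SearchCursor` yields rows as tuples, the two cursors could be passed in directly in the real script.

```python
def match_attachments(feature_rows, attachment_rows):
    """Pair each attachment with its feature's edit date using a dict lookup.

    feature_rows: iterable of (globalid, edit_date) tuples
    attachment_rows: iterable of (rel_globalid, data, att_name) tuples
    Hypothetical helper: the name and signature are illustrative only.
    """
    # One pass over the features builds the lookup table
    edit_dates = {globalid: edit_date for globalid, edit_date in feature_rows}
    matched = []
    for rel_globalid, data, att_name in attachment_rows:
        # Skip attachments whose related feature is not in the feature class
        if rel_globalid in edit_dates:
            matched.append((att_name, edit_dates[rel_globalid], data))
    return matched


# Made-up rows standing in for the two cursors
features = [("{A}", "2016-10-06"), ("{B}", "2016-10-07")]
attachments = [("{B}", b"\x89PNG", "photo.png"), ("{Z}", b"", "orphan.txt")]
print(match_attachments(features, attachments))
```

Each feature is read once and each attachment is matched with a single dictionary lookup, instead of scanning every feature for every attachment.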