
Automate adding Filepath to attributes in table

06-29-2022 07:46 AM
Caitlin
Occasional Contributor

Hi all,

I'm working on a Python script in an ArcGIS Pro notebook that adds the file path of the PDF document associated with each record in a table to a new field in that same table.

The issue is that the PDF documents are in different folders (layout shown below):

MAINFOLDER
      A
            STREETNAME1
                   ADDRESS1.pdf
                   ADDRESS2.pdf
            STREETNAME2
            STREETNAME3
      B
            STREETNAME1
                   ADDRESS1.pdf
                   ADDRESS2.pdf
            STREETNAME2
            STREETNAME3
      C
...

 

If all of the PDF documents had the same path, I would just use an UpdateCursor to write the path + ADDRESSField. Since the path differs from file to file, I'm not sure how to add the correct path to each record.

 

Currently, I've used the glob module to collect all of the PDF file paths in a list, and I have a list of lists holding the STREET_NUM and STREETNAME values of each record, which I'm using to associate each record with the correct PDF.

Code:

import glob
import os

import arcpy

os.chdir(r'M:\Service Sheets') # raw string so the backslash is not read as an escape

servicePath = glob.glob("*/*/*.pdf") # List of all PDF files, as paths relative to the folder above
fc = "Parcel_Subset"
fields = ["STREET_NUM", "STREETNAME", "PDFLink"] # PDFLink is where the output will go
LocList = []


with arcpy.da.SearchCursor(fc, ['STREET_NUM', 'STREETNAME']) as cursor:
    for row in cursor:
        LocList.append(row) # Creates the List of Lists

# Here would be the UpdateCursor, if I could get it to work

 

I'm not sure this is the best method, or whether there's an easier way that I just don't know about. This would also only add a text field with the path; if possible, it would be better to make the path a clickable link that opens the PDF document.

 

Any advice or solutions would be appreciated!


Accepted Solutions
by Anonymous User
Not applicable

I'm not sure how you are tying the street number and street name to the file names in the folders, but as an idea for the first part of your question, you could use a dictionary to store all of the files and their paths. I can't really tell which attributes tie to which folder or file, so this is only an idea that you can expand on if it looks doable with your data.

 

import os

def recursive_walk(parentFolder):
    pdfDict = {}

    # os.walk already descends into every subfolder, so no manual recursion is needed
    for folderName, subfolders, filenames in os.walk(parentFolder):
        pths = folderName.split(os.sep)
        for filename in filenames:
            if not filename.lower().endswith('.pdf'):
                continue # skip anything that is not a PDF
            name = os.path.splitext(filename)[0] # file name without the extension
            fullPath = os.path.join(folderName, filename)
            print(fullPath)
            # key: parent folder + street folder + file name; 'attrMatch': street folder + file name
            pdfDict[pths[-2] + pths[-1] + name] = {'attrMatch': pths[-1] + name, name: fullPath}

    return pdfDict

pdfDict = recursive_walk(r'C:\Users\Documents\PDFs')

 

 

This produces a dictionary like:

{'AStreetname1address1': {'attrMatch': 'Streetname1address1', 'address1': 'C:\\Users\\Documents\\PDFs\\A\\Streetname1\\address1.pdf'},
'AStreetname1address2': {'attrMatch': 'Streetname1address2', 'address2': 'C:\\Users\\Documents\\PDFs\\A\\Streetname1\\address2.pdf'},
'AStreetname2address1': {'attrMatch': 'Streetname2address1', 'address1': 'C:\\Users\\Documents\\PDFs\\A\\Streetname2\\address1.pdf'},
...
'CStreetname3address1': {'attrMatch': 'Streetname3address1', 'address1': 'C:\\Users\\Documents\\PDFs\\C\\Streetname3\\address1.pdf'},
'CStreetname3address2': {'attrMatch': 'Streetname3address2', 'address2': 'C:\\Users\\Documents\\PDFs\\C\\Streetname3\\address2.pdf'}}
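
To make the key format concrete, here is a minimal sketch of how one path turns into the outer key and the 'attrMatch' value. The helper name keys_for is hypothetical and not part of the original answer; it uses PureWindowsPath so the Windows-style sample path splits the same way on any operating system.

```python
from pathlib import PureWindowsPath


def keys_for(pdf_path):
    """Return (outer key, attrMatch) for one PDF path (hypothetical helper)."""
    p = PureWindowsPath(pdf_path)
    street = p.parent.name         # e.g. 'Streetname1'
    group = p.parent.parent.name   # e.g. 'A'
    stem = p.stem                  # file name without '.pdf'
    return group + street + stem, street + stem


full_key, attr_match = keys_for(r"C:\Users\Documents\PDFs\A\Streetname1\address1.pdf")
print(full_key)    # AStreetname1address1
print(attr_match)  # Streetname1address1
```

Note that 'attrMatch' leaves out the A/B/C folder, so two PDFs with the same street folder and file name under different letters would share an 'attrMatch' value.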

 

 

 

Then in the cursor, get the path by looking up the combo of attributes:

 

with arcpy.da.UpdateCursor(fc, ['STREET_NUM', 'STREETNAME', 'PDFLink']) as cursor:
    for row in cursor:
        attrConcat = f"{row[1]}{row[0]}" # STREETNAME + STREET_NUM, to compare with each entry's 'attrMatch'
        for k, v in pdfDict.items(): # scan the entries for a matching 'attrMatch' value
            if v['attrMatch'] == attrConcat:
                attr, path = v.items() # two (key, value) pairs: ('attrMatch', ...) and (file name, full path)
                row[2] = f'<a href="{path[1]}" target="_top">{attr[1]}</a>' # set PDFLink to an HTML link to the PDF
                cursor.updateRow(row) # the method name is updateRow, with a lowercase u
                break # stop scanning once a match is written
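
If the table is large, scanning the whole dictionary for every row can get slow. One possible variation, sketched below in plain Python without arcpy, is to re-key the dictionary by 'attrMatch' once and then look each row up directly. The function names index_by_attr_match and find_path are hypothetical, and lowercasing both sides is an assumption about how the attribute values and folder names might differ in case.

```python
def index_by_attr_match(pdf_dict):
    """Flatten {key: {'attrMatch': m, stem: path}} into {m.lower(): path}."""
    lookup = {}
    for entry in pdf_dict.values():
        match_key = entry["attrMatch"].lower()
        # The only other item in each entry is the file-stem -> full-path pair.
        path = next(v for k, v in entry.items() if k != "attrMatch")
        lookup.setdefault(match_key, path)  # keep the first path if two entries clash
    return lookup


def find_path(lookup, street_name, street_num):
    """Return the PDF path for a row, or None when nothing matches."""
    return lookup.get(f"{street_name}{street_num}".lower())


sample = {
    "AStreetname1address1": {
        "attrMatch": "Streetname1address1",
        "address1": "C:\\Users\\Documents\\PDFs\\A\\Streetname1\\address1.pdf",
    },
}
lookup = index_by_attr_match(sample)
print(find_path(lookup, "STREETNAME1", "address1"))
```

Inside the cursor loop, find_path(lookup, row[1], row[0]) would then take the place of the inner for loop; rows that return None have no matching PDF and can be skipped.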

 

 


3 Replies
AlfredBaldenweck
MVP Regular Contributor

Regarding the second part of the question, you can store the value as an HTML anchor tag, which will be rendered as a clickable link.

<a href="C:\Users\USER\...\Lorem Ipsum.docx" target="_top">Lorem Ipsum.docx</a>
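
As a rough sketch of building that tag in Python: the helper name build_link is hypothetical, and escaping the values with the standard library's html.escape is an added precaution for file names containing characters such as & or quotes, not something the thread itself requires.

```python
import html


def build_link(path, label):
    """Return an HTML anchor tag pointing at path, shown as label (hypothetical helper)."""
    # html.escape replaces &, <, > and quotes so they cannot break the tag.
    return f'<a href="{html.escape(path)}" target="_top">{html.escape(label)}</a>'


print(build_link(r"C:\docs\A&B.pdf", "A&B"))
# <a href="C:\docs\A&amp;B.pdf" target="_top">A&amp;B</a>
```

The returned string is what would be written to the text field, provided the field is long enough to hold the whole tag.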

 

Caitlin
Occasional Contributor

Thank you for the help! I had to make some edits to make this work with my data, but this was pretty much exactly what I was looking for! 
