Add Path to JPGs in Attribute Table based on Unique ID

462
11
Jump to solution
03-19-2019 08:10 AM
JaredPilbeam2
MVP Regular Contributor

I've been tasked with updating attribute table fields with file paths to image files in a folder based on a unique ID. I'm having trouble thinking up a script that will do this? I'm working on a standalone script in ArcGIS Pro 2.3.1 Arcpy. 

This is an example of how the table should look. Every JPG has a unique ID that matches the Post_Number field unique ID. Currently, the paths to the JPGs are entered manually into the table, and there are 1000s of them.

All I've done so far is create a cursor that searches the Post_Number and Photo fields. 

import arcpy, os

fc = r'pathto\Projects\DOT_SignInventory\Data.gdb\WCDOT_Signs'
fields = ['Post_Number', 'Photo']

    with arcpy.da.SearchCursor(fc, fields) as cursor:
        for row in cursor:
            print(u"{0}, {1}".format(row[0], row[1]))
0 Kudos
1 Solution

Accepted Solutions
RandyBurton
MVP Regular Contributor

I was testing something like:

import arcpy
import os

startDir = r'C:\path\to\jpegs\root'
jpgs = {} # empty dictionary

for root, dirs, files in os.walk(startDir):
    for file in files:
        if file.endswith(".jpg"): # for key, trim '.jpg'
            jpgs[file[:-4]] = os.path.join(root,file)

# print jpgs

fc = r'C:\path\to\file.gdb\photos'
fields = ['OID@', 'Post_Number', 'Photo']

with arcpy.da.UpdateCursor(fc, fields) as cursor:
    for row in cursor:
        try:
            row[2] = jpgs[row[1]]
            cursor.updateRow(row)
        except KeyError:
            print "No photo for: {}".format(row[1])
    
del cursor‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

This assumes the 'Post_Number' is text, and that the 'Photo' field length is able to take the complete path.  I was testing with 2.7, so some adjustment may be needed for 3.x.  I'm also not sure at what size a dictionary might become too big.

View solution in original post

11 Replies
MitchHolley1
MVP Regular Contributor

All all JPGs stored in the same directory?  If so, you can use string formatting and an update cursor to create a complete path.  Also, a Calculate Field could work as well.

Test this on scratch data first (create a backup).

import arcpy, os

fc = r'pathto\Projects\DOT_SignInventory\Data.gdb\WCDOT_Signs'
jpg_dir = r"P:\GIS_Highway_Dept\Applications\Sign Inventory\Photos\CH01"
fields = ['Post_Number', 'Photo']

with arcpy.da.UpdateCursor(fc, fields) as cursor:
    for row in cursor:
        path_to_jpg = """{0}\{1}.jpg""".format(jpg_dir, row[0])
        cursor.updateRow(row)
del cursor‍‍‍‍‍‍‍‍‍‍‍
JaredPilbeam2
MVP Regular Contributor

Mitch,

Thanks for the reply. There are multiple folders, actually. The folders have JPGs of signs of a certain highway. And there are as many folders containing these JPGs as there are county highways (CH01, for example).

EDIT: I tried the code you provided on a copy, but it didn't seem to get updated. I put a print statement in after updateRow() and the directory is actually being called because it printed out all the JPGs.

cursor.updateRow(row)
print(path_to_jpg)‍‍
0 Kudos
curtvprice
MVP Esteemed Contributor

I tried the code you provided on a copy, but it didn't seem to get updated.

Jared, the code has a bug, Mitch forgot update the cursor with the updated data. Here's a fix

with arcpy.da.UpdateCursor(fc, fields) as cursor:
    for row in cursor:
        path_to_jpg = """{0}\{1}.jpg""".format(jpg_dir, row[0])
        row[1] = path_to_jpg # <-- added this line
        cursor.updateRow(row)
del cursor‍‍‍‍‍‍
JaredPilbeam2
MVP Regular Contributor

Curtis,

Thanks, that almost got me there. But, for the jpg_dir variable, however this string looks is how it's updated in the table. For example,

if it's like this in the variable: 

jpg_dir = r'P:\GIS_Highway_Dept\Applications\Sign Inventory\Photos\CH01'

then it's updated like this in the table:

P:\GIS_Highway_Dept\Applications\Sign Inventory\Photos\CH01\2410765.jpg‍‍‍

The problem is 2410765.jpg is County Highway 24, as the first two digits in the ID is the highway number. So, that jpg isn't linked correctly, and neither will any be unless they begin with 01 (in the CH01 folder).

Do I have to iterate the Photos directory first and create a list or library, as Randy pointed out?

0 Kudos
curtvprice
MVP Esteemed Contributor

If you don't have the route number in the table already, yes, I think you'll need to build a dictionary off the existing jpg paths to get that information.  os.walk or arcpy.da.walk should work for this.

0 Kudos
RandyBurton
MVP Regular Contributor

Is the 7 digit number always unique?  That is, it does not depend on a directory such as "CH01" to make it unique?  If so, I would probably create a script that walks through all your jpeg directory and inserts the filename as a key and the path as a value into a dictionary.  Then you could use an update cursor to read the field containing the jpeg file name (extracting the file name if necessary) and replace the path with the dictionary value.

0 Kudos
JaredPilbeam2
MVP Regular Contributor

Randy,

That's correct. The first two digits of the 7 digit unique ID are based on the highway number and it's respective directory. For example:

P:\GIS_Highway_Dept\Applications\Sign Inventory\Photos\CH01\0110930.jpg

P:\GIS_Highway_Dept\Applications\Sign Inventory\Photos\CH19\1910930.jpg

P:\GIS_Highway_Dept\Applications\Sign Inventory\Photos\CH84\8410930.jpg

Etc... 

A dictionary is something I have to get to know how to utilize better. So, thanks, that gives me something to chew on.

0 Kudos
RandyBurton
MVP Regular Contributor

I was testing something like:

import arcpy
import os

startDir = r'C:\path\to\jpegs\root'
jpgs = {} # empty dictionary

for root, dirs, files in os.walk(startDir):
    for file in files:
        if file.endswith(".jpg"): # for key, trim '.jpg'
            jpgs[file[:-4]] = os.path.join(root,file)

# print jpgs

fc = r'C:\path\to\file.gdb\photos'
fields = ['OID@', 'Post_Number', 'Photo']

with arcpy.da.UpdateCursor(fc, fields) as cursor:
    for row in cursor:
        try:
            row[2] = jpgs[row[1]]
            cursor.updateRow(row)
        except KeyError:
            print "No photo for: {}".format(row[1])
    
del cursor‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

This assumes the 'Post_Number' is text, and that the 'Photo' field length is able to take the complete path.  I was testing with 2.7, so some adjustment may be needed for 3.x.  I'm also not sure at what size a dictionary might become too big.

JaredPilbeam2
MVP Regular Contributor

Randy,

Thanks for that. To answer your assumptions, yes, the 'Post_Number' field is text and the 'Photo' field length is 150--so, plenty long.

If I try your code word for word it runs with no error, but in the 'Photo' field it doesn't give the path. Here's an example of what it puts there:

Thumbs.db\0110030.jpg

Not sure why it does this? There's no 'Thumbs' file in the directory. 

0 Kudos