Comparing a Search Cursor list to number of PDFs Created.

MichelleCouden1 · ‎03-01-2013

I need some help with my program. I have the parameters set to ask the user to pick an mxd and the folder where its Data Driven Pages PDFs are located. I started with a Search Cursor to create a list from the Sheet_ID field of an mxd, to get the total number of pages for that mxd. Now I need the program to compare the list created from the sheet numbers to the number of PDFs created and tell me if there is a difference in the total count. For example: if the mxd has 11 sheets (1 of 11 through 11 of 11), then there should be 11 PDFs created, and so on. My code is below:

#Purpose: Compares Sheet_ID field to the total PDFs created from Data Driven Pages Program

import arcpy, os, string, sys

#Make parameters for people to choose mxd and folder of PDFs to compare
mxdList = string.split(arcpy.GetParameterAsText(0), ";")
dir = arcpy.GetParameterAsText(1)

#Use Search Cursor to go through Attribute Table to get Sheet number info
rows = arcpy.SearchCursor(mxdList, "Sheet_ID")

#Use list from Sheet_ID field to match the number of PDFs created to get difference
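
The comparison described in that last comment could be sketched roughly as follows. This is only an illustration in plain Python: `count_difference` is a hypothetical helper, `sheet_ids` stands in for the values a search cursor would read from the Sheet_ID field, and `file_names` stands in for a listing of the PDF folder (for example from `os.listdir`).

```python
def count_difference(sheet_ids, file_names):
    """Return (sheet_total, pdf_total, difference) for one map document."""
    # Distinct Sheet_ID values, ignoring empty (None) cells
    sheet_total = len(set(s for s in sheet_ids if s is not None))
    # Files whose extension is .pdf, compared case-insensitively
    pdf_total = len([f for f in file_names if f.lower().endswith(".pdf")])
    return sheet_total, pdf_total, sheet_total - pdf_total


# Made-up values: four sheets, but only three PDFs in the folder
sheet_total, pdf_total, diff = count_difference(
    [1, 2, 3, 4], ["Map1.pdf", "Map2.pdf", "Map3.PDF", "notes.txt"])
message = "Sheets: %s, PDFs: %s, difference: %s" % (sheet_total, pdf_total, diff)
```

A difference of zero would mean every sheet has a PDF; a positive number would mean some sheets were not exported.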

MichelleCouden1 · ‎03-05-2013

You know, setting the parameters to multiple mxds and then checking for layers by the grid name would probably be better, because all the grids are named like this: ABL_Base_Grid.shp, AMA_Base_MBGrid.shp, AUS_AnnualMBGrid.shp. The only thing that changes is the abbreviation of the district within the State of Texas. So it could search for *Grid and then read the Sheet_ID field.

Anonymous User · ‎03-05-2013

I think it would be easier if you could do it on only one mxd. Then you could just search for all those layers one at a time inside the mxd.

MichelleCouden1 · ‎03-06-2013

Yep, that is what I think. Please, take a look at the code to see if I changed stuff correctly. I figured out to use ListLayers. See how close I am.

import arcpy, os, sys
from os import path as p
from arcpy import mapping as m

#Make parameters for people to choose mxd and folder of PDFs to compare

mxdList = string.split(arcpy.GetParameterAsText(0), ";")
pdf_path = arcpy.GetParameterAsText(1)

report = {}

#Use Search Cursor to go through Attribute Table to get Sheet number info

for mapDoc in mxdList:
    arcpy.AddMessage(mapDoc)
    mxd = arcpy.mapping.MapDocument(mapDoc)
    df = arcpy.mapping.ListLayers(mxd, "*,Traff Proj")[0]
    for lyr in m.ListLayers(mxd, "*, Grid"):
        if lyr.description == "*, Grid":
            max_list = []
            rows = arcpy.SearchCursor(lyr)
            for row in rows:
                max_list.append(row.Sheet_ID)
                sheet_count = max(max_list)
                arcpy.AddMessage('Sheet count: %s'%sheet_count)
                print 'Sheet count: %s'%sheet_count
            else:
                arcpy.AddError('No Layers in %s match data source'%mapDoc)

    page_list = []

Anonymous User · ‎03-06-2013

It looks like you are a little closer; I have highlighted the problems I see in red. For the future, when you post your code you should use the code tags so that your indentation is preserved, which is important in Python (use the hashtag symbol in the editor, or start with code inside [ ] and end with /code inside [ ]). On this line:

df = arcpy.mapping.ListLayers(mxd, "*,Traff Proj")[0]

This may not be a problem in your case, but when you use ListLayers with [0] afterwards, it will only grab the first value in the list. I see that you are not using this "df" variable; what is its purpose? In most code samples from the help docs, df typically denotes the data frame. Was this supposed to be ListDataFrames instead of ListLayers?

At the end you are creating an empty list called page_list, but doing nothing with it. I would add something like this at the end:

page_list = []
arcpy.env.workspace = ws = pdf_path
for pdf_doc in arcpy.ListFiles('*.pdf'):
    page_list.append(int(''.join([i for i in pdf_doc.split('.')[0] if i.isdigit()])))
page_count = max(page_list)
arcpy.AddMessage('PDF page count: %s' % page_count)
if page_count == sheet_count:
    arcpy.AddMessage('%s Sheet Count matches number of pages in %s' %(mapDoc,pdf_doc))
else:
    arcpy.AddMessage('%s Sheet Count does not match number of pages in %s' %(mapDoc,pdf_doc))
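
To make the list comprehension in that snippet easier to follow, here is a small standalone sketch of what it does to a file name, with no arcpy involved. The file names are made up for illustration, and `page_number_trailing` is a hypothetical alternative, not part of the snippet above. Note that joining every digit in the name gives a surprising result when the name contains other numbers, such as a year.

```python
import re


def page_number_joined(pdf_name):
    """Mimic the list comprehension: join every digit found in the base name."""
    base = pdf_name.split('.')[0]
    return int(''.join(c for c in base if c.isdigit()))


def page_number_trailing(pdf_name):
    """Hypothetical alternative: read only the digits right before '.pdf'."""
    match = re.search(r'(\d+)\.pdf$', pdf_name, re.IGNORECASE)
    return int(match.group(1)) if match else None


simple = page_number_joined("Amarillo_Base_Map9.pdf")       # 9
with_year = page_number_joined("AMA_2013_Map9.pdf")         # 20139, not 9
trailing_only = page_number_trailing("AMA_2013_Map9.pdf")   # 9
no_digits = page_number_trailing("cover.pdf")               # None
```

A name with no digits at all, such as "cover.pdf", would make `page_number_joined` raise a ValueError, because `int('')` is not valid.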

In my example I used the report = {} to create an empty dictionary to store the values that did not match up to write out to a text file. If you do not want to include this part in your script tool you could take this out. I think you're almost there now as long as your wild card usage is correct.
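
On the wildcard usage, a rough illustration may help. This sketch uses Python's standard `fnmatch` module as a stand-in for asterisk wildcards; that it behaves the same way as the ListLayers wildcard is an assumption, and the layer names are borrowed from the grid names mentioned earlier in the thread (plus a made-up "Roads" layer).

```python
from fnmatch import fnmatchcase

layer_names = ["ABL_Base_Grid", "AMA_Base_MBGrid", "AUS_AnnualMBGrid", "Roads"]

# "*Grid" matches any name that ends in "Grid"
ends_with_grid = [n for n in layer_names if fnmatchcase(n, "*Grid")]

# "*, Grid" only matches names that literally end in ", Grid"
# (the comma and the space are part of the pattern)
comma_pattern = [n for n in layer_names if fnmatchcase(n, "*, Grid")]

# == compares the whole string; the asterisk is not treated as a wildcard
equality = [n for n in layer_names if n == "*Grid"]
```

Here only the first pattern picks up the three grid layers; the other two lists come back empty. The same reasoning would apply to a test such as `lyr.description == "*, Grid"`, which is only true when the description is exactly that text.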

MichelleCouden1 · ‎03-06-2013

The only reason I used df for data frame was because of the sample I got from ListLayers on the resource desk. I would prefer not to use df because in this case it would be easier for it to just search for the layer. Am I right on the query for the name of the layer? For example: I put an asterisk in front of the name to let it know that Grid is not the complete name but it does have Grid in the name.
Before I finish completely, I just want to give you a huge thanks for helping me on this project. You are the first person that has shown me patience. I truly, truly appreciate that. This could never have been done without your help. This is by far the longest Python code I've ever written.

Anonymous User · ‎03-06-2013

No problem, I am no expert but do not mind helping when I can. I think you could take the df variable out, it is an optional parameter for the list layers so may not be necessary for your case. One thing that I usually do when testing a script tool is test it as a stand alone script, then once it works I will convert it to script tool. Of course, since you have a multi value parameter you may have to modify some things slightly for it to work as a script tool.

Have you added the last part from my last post yet? Go ahead and give that a try. If you are still getting errors let me know what they say. You may also want to add some more print statements to let you know what is going on in the script.

MichelleCouden1 · ‎03-06-2013

It doesn't like my code for sheet count. I've attached the error again for you. It finishes counting the PDFs but it is puzzled on how to give me a sheet count. The error is that sheet_count is not defined. So wouldn't I need to write sheet_count = arcpy.mapping.ListLayers?

Anonymous User · ‎03-06-2013

Not sure if this is what was causing a problem, but it looks like the indentation was off judging by your prior example. I did not catch that the first time. Another thing that may be throwing an error is that the "sheet_count" variable has no max value because the search cursor produced an empty list in the first place. This could be due to it not picking up any layers from your ListLayers().

import arcpy, os, sys, string
from os import path as p
from arcpy import mapping as m

#Make parameters for people to choose mxd and folder of PDFs to compare

mxdList = string.split(arcpy.GetParameterAsText(0), ";")
pdf_path = arcpy.GetParameterAsText(1)

report = {}

#Use Search Cursor to go through Attribute Table to get Sheet number info

for mapDoc in mxdList:
    arcpy.AddMessage(mapDoc)
    mxd = arcpy.mapping.MapDocument(mapDoc)
    df = arcpy.mapping.ListLayers(mxd, "*,Traff Proj")[0]
    for lyr in m.ListLayers(mxd, "*, Grid"):
        if lyr.description == "*, Grid":
            max_list = []
            rows = arcpy.SearchCursor(lyr)
            for row in rows:
                max_list.append(row.Sheet_ID)

        else:
            arcpy.AddError('No Layers in %s match data source'%mapDoc)

        sheet_count = max(max_list)
        arcpy.AddMessage('Sheet count: %s'%sheet_count)
        print 'Sheet count: %s'%sheet_count
                
        page_list = []
        arcpy.env.workspace = ws = pdf_path
        for pdf_doc in arcpy.ListFiles('*.pdf'):
            page_list.append(int(''.join([i for i in pdf_doc.split('.')[0] if i.isdigit()])))
        page_count = max(page_list)
        arcpy.AddMessage('PDF page count: %s' % page_count)
        if page_count == sheet_count:
            arcpy.AddMessage('%s Sheet Count matches number of pages in %s' %(mapDoc,pdf_doc))
        else:
            arcpy.AddMessage('%s Sheet Count does not match number of pages in %s' %(mapDoc,pdf_doc))
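
The empty-list case mentioned above, and the idea of collecting mismatches in the `report` dictionary, could be handled along these lines. This is a sketch in plain Python under stated assumptions: `check_counts` and `write_report` are hypothetical helpers, and the two lists stand in for the values gathered from the search cursor and from the PDF file names.

```python
def check_counts(map_name, sheet_ids, page_numbers, report):
    """Compare highest sheet and page numbers; record problems in `report`."""
    if not sheet_ids:
        # max([]) raises ValueError, so handle the empty case explicitly
        report[map_name] = "no Sheet_ID values found"
        return False
    if not page_numbers:
        report[map_name] = "no PDFs found"
        return False
    sheet_count, page_count = max(sheet_ids), max(page_numbers)
    if sheet_count != page_count:
        report[map_name] = "sheets: %s, PDF pages: %s" % (sheet_count, page_count)
        return False
    return True


def write_report(report, out_path):
    """Write one line per map document that did not match."""
    with open(out_path, "w") as f:
        for map_name in sorted(report):
            f.write("%s: %s\n" % (map_name, report[map_name]))
```

Called once per map document inside the loop, this would leave `report` holding only the documents that need attention, which `write_report` can then dump to a text file.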

I am not sure about the PDF situation now, though. I cannot see the PDFs you are working with, nor the data and mxds. Writing this tool may be more trouble than it is worth. If the whole reason you are doing this check is that you need both a "single file PDF map book" (i.e., all the DDP pages in one PDF) and a set where every page is in a separate PDF, I think it would be much easier to write a script that exports all DDP pages into a single PDF and then splits that into single pages. You could simply set up an iterator on each PDF and create a one-page PDF for each page in the master map book. That way you could be sure that, no matter what, you will have a page for each Sheet_ID.
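
As a variation on that idea, and only as a sketch: the Data Driven Pages object in arcpy.mapping has an `exportToPDF` method with a `multiple_files` option, which, as far as I know, can write one PDF per page directly instead of exporting a single file and splitting it afterwards. `export_pages_separately` is a hypothetical helper, and the output folder and base name are placeholders; it expects a map document opened with `arcpy.mapping.MapDocument` that has Data Driven Pages enabled.

```python
import os


def export_pages_separately(mxd, out_dir, base_name):
    """Export every Data Driven Page of `mxd` to its own PDF.

    Returns the number of pages the map document reports.
    """
    ddp = mxd.dataDrivenPages
    out_pdf = os.path.join(out_dir, base_name + ".pdf")
    # "ALL" exports every page; PDF_MULTIPLE_FILES_PAGE_INDEX asks for one
    # file per page, with the page index appended to the file name
    ddp.exportToPDF(out_pdf, "ALL",
                    multiple_files="PDF_MULTIPLE_FILES_PAGE_INDEX")
    return ddp.pageCount


# Usage inside ArcGIS (paths are placeholders):
#   import arcpy
#   mxd = arcpy.mapping.MapDocument(r"C:\path\to\map.mxd")
#   total = export_pages_separately(mxd, r"C:\path\to\pdfs", "Base_Map")
```

The returned `pageCount` could also serve as the expected number of PDFs when checking the folder afterwards.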

MichelleCouden1 · ‎03-06-2013

I figured out the name not defined. Needed to make sheet_count = 0. It is giving me the correct PDF count in the folder. But it's not counting the sheets right. The one statement I got was "Sheet count does not match number of pages in Amarillo_Base_Map9.pdf" There are no pdfs inside the mxd. It should be getting the sheet count from the Sheet_ID field inside the mxd. The error code should be telling me "Sheet Count in mxd does not match PDF count created" Code is below:

import arcpy, os, sys, string
from os import path as p
from arcpy import mapping as m

#Make parameters for people to choose mxd and folder of PDFs to compare

mxdList = string.split(arcpy.GetParameterAsText(0), ";")
pdf_path = arcpy.GetParameterAsText(1)

#Use Search Cursor to go through Attribute Table to get Sheet number info

sheet_count = 0
for mapDoc in mxdList:
    arcpy.AddMessage(mapDoc)
    mxd = arcpy.mapping.MapDocument(mapDoc)
    for lyr in m.ListLayers(mxd, "*, Grid"):
        if lyr.description == "*, Grid":
            max_list = []
            rows = arcpy.SearchCursor(lyr)
            for row in rows:
                max_list.append(row.Sheet_ID)
            sheet_count = max(max_list)
            arcpy.AddMessage('Sheet count: %s'%sheet_count)
            print 'Sheet count: %s'%sheet_count
        else:
            arcpy.AddError('No Layers in %s match data source'%mapDoc)
                
    page_list = []
    
    # List all PDF's in the pdf_path folder
    
    arcpy.env.workspace = ws = pdf_path
    for pdf_doc in arcpy.ListFiles('*.pdf'):
        page_list.append(int(''.join([i for i in pdf_doc.split('.')[0] if i.isdigit()])))
    page_count = max(page_list)
    arcpy.AddMessage('PDF Page Count: %s' %page_count)
    if page_count == sheet_count:
        arcpy.AddMessage('%s Sheet Count matches number of pages in %s' %(mapDoc,pdf_doc))
    else:
        arcpy.AddMessage('%s Sheet Count does not match number of pages in %s' %(mapDoc,pdf_doc))

MichelleCouden1 · ‎03-06-2013

I am actually doing a process of elimination. It is the first part of the code that is not doing what we want it to do. I saved just the top part as another script, and I'm having it work through the mxd, count the sheets, and print the total. So far, it's not doing either. It runs, but it doesn't print out a total number of sheets from that Grid layer.