Comparing a Search Cursor list to number of PDFs Created.

MichelleCouden1 · ‎03-01-2013

I need some help with my program. I have the parameters set up so the user picks an mxd and the folder where its data driven PDFs are located. I started with a Search Cursor to create a list from the Sheet_ID field of the mxd, to get the total number of pages for that mxd. Now I need the program to compare the list created from the sheet numbers to the number of PDFs created and tell me if there is a difference in the total count. For example: if the mxd has 11 sheets (1 of 11, 2 of 11, and so on), then there should be 11 PDFs created. My code is below:

#Purpose: Compares Sheet_ID field to the total PDFs created from Data Driven Pages Program

import arcpy, os, string, sys

#Make parameters for people to choose mxd and folder of PDFs to compare
mxdList = string.split(arcpy.GetParameterAsText(0), ";")
dir = arcpy.GetParameterAsText(1)

#Use Search Cursor to go through Attribute Table to get Sheet number info
rows = arcpy.SearchCursor(mxdList, "Sheet_ID")

#Use list from Sheet_ID field to match the number of PDFs created to get difference

MichelleCouden1 · ‎03-11-2013

Answered!!


Anonymous User · ‎03-04-2013

I think you are on the right track here, but you are not using the search cursor correctly. A search cursor can only be run on a feature class or table, and you are passing it a list of mxds. Here is a way to go through the attribute table of the feature class that contains the Sheet_ID field. I set up another parameter called fc_path where the user can point to the feature class that has the Sheet_ID field. The script then searches each mxd in the input list for a layer whose data source matches the feature class (in case the layers have different names in the mxd).

#Purpose: Compares Sheet_ID field to the total PDFs created from Data Driven Pages Program

import arcpy, os, sys

#Make parameters for people to choose mxd and folder of PDFs to compare
mxdList = arcpy.GetParameterAsText(0).split(";")
dir = arcpy.GetParameterAsText(1)
fc_path = arcpy.GetParameterAsText(2)

for mapDoc in mxdList:
    arcpy.mapping.MapDocument(mapDoc)
    for lyr in m.ListLayers(mxd):
        if lyr.supports('DATASOURCE'):
            if lyr.dataSource == fc_path:
                max_list = []
                rows = arcpy.SearchCursor(fc_path, "Sheet_ID")
                for row in rows:
                    max_list.append(row.Sheet_ID)
                pdf_count = max(max_list)
                arcpy.AddMessage('Sheet count:  %s'%pdf_count)
            else:
                arcpy.AddError('No layers in %s match data source'%mapDoc)

I am a little confused about comparing the page count from the Sheet_ID field of a feature class. Are you going to have the user select PDFs manually to check? If so, you will have to find a way to match up the mxds to the PDFs to make sure each sheet count is compared against the correct PDF. This would be easy if the PDFs are named after the mxds. You will also need to read the page count from each PDF, which can be done like this:

pdfDoc = arcpy.mapping.PDFDocumentOpen(path)
print pdfDoc.pageCount
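
For matching the PDFs to an mxd by name, something like the sketch below might be a starting point. It is plain Python with no arcpy calls, and it assumes each PDF is named after the mxd with only a page number appended (for example County_Map1.pdf for County_Map.mxd). The function name pdfs_for_map and the file names are made up for illustration.

```python
import os
import re

def pdfs_for_map(mxd_path, pdf_names):
    """Return the PDF names that look like pages exported from mxd_path.

    Assumption: each PDF is named <mxd name><page number>.pdf.
    """
    # Strip the folder and the .mxd extension to get the base name
    base = os.path.splitext(os.path.basename(mxd_path))[0]
    # Base name, then one or more digits, then .pdf and nothing else
    pattern = re.compile(re.escape(base) + r'\d+\.pdf$', re.IGNORECASE)
    return [name for name in pdf_names if pattern.match(name)]

# Hypothetical folder listing
names = ['County_Map1.pdf', 'County_Map2.pdf', 'Other_Map1.pdf']
matches = pdfs_for_map('County_Map.mxd', names)
```

len(matches) would then be the number of PDFs to compare against the sheet count for that mxd.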

MichelleCouden1 · ‎03-04-2013

I'm getting an "Object: Error in getting parameter as text" error. Could it be that I need to drop the dir = because of the fc_path =? I did add the pdfDoc because you guessed right: I am having the user pick the mxd and the folder with all the PDFs in it. For example, the mxd is named Abilene_Base_Map and the PDFs are named Abilene_Base_Map1, Abilene_Base_Map2 and so on.
Please check my code to make sure I got it right. Thanks!!

import arcpy, os, string, sys

#Make parameters for people to choose mxd and folder of PDFs to compare
mxdList = string.split(arcpy.GetParameterAsText(0), ";")
dir = arcpy.GetParameterAsText(1)
fc_path = arcpy.GetParameterAsText(2)
pdfDoc = arcpy.mapping.PDFDocumentOpen(path)

#Use Search Cursor to go through Attribute Table to get Sheet number info
for mapDoc in mxdList:
    arcpy.mapping.MapDocument(mapDoc)
    for lyr in m.ListLayers(mxd):
        if lyr.supports('DATASOURCE'):
            if lyr.dataSource == fc_path:
                max_list = []
                rows = arcpy.SearchCursor (mxdList, "Sheet_ID")
                for row in rows:
                    max_list.append(row.Sheet_ID)
                pdf_count = max(max_list)
                arcpy.AddMessage('Sheet count: %s'%pdf_count)
            else:
                arcpy.AddError('No Layers in %s match data source'%mapDoc)
print pdfDoc.pageCount

Anonymous User · ‎03-04-2013

I am a little confused about your PDF files. You say they are named Abilene_Base_Map1 , Abilene_Base_Map2 etc. Are these separate map books or are they individual 1 page pdf's that will be combined into one map book?

MichelleCouden1 · ‎03-04-2013

The PDFs are created from the one mxd. For example: Abilene has 11 sheets (data driven pages), so when that tool makes the PDFs it makes 11 PDFs, one for each sheet or data driven page. I am having a little trouble with the code that you sent earlier. It is saying m is not defined. You have it typed as for lyr in m.ListLayers(mxd): Can you tell me what m is so I can fix it?

import arcpy, os, string, sys

#Make parameters for people to choose mxd and folder of PDFs to compare
mxdList = string.split(arcpy.GetParameterAsText(0), ";")
dir = arcpy.GetParameterAsText(1)
fc_path = arcpy.GetParameterAsText(2)

#Use Search Cursor to go through Attribute Table to get Sheet number info
for mapDoc in mxdList:
    arcpy.mapping.MapDocument(mapDoc)
    for lyr in mxdList.ListLayers(mxd):
        if lyr.supports('DATASOURCE'):
            if lyr.dataSource == fc_path:
                max_list = []
                rows = arcpy.SearchCursor (mxdList, "Sheet_ID")
                for row in rows:
                    max_list.append(row.Sheet_ID)
                pdf_count = max(max_list)
                arcpy.AddMessage('Sheet count: %s'%pdf_count)
            else:
                arcpy.AddError('No Layers in %s match data source'%mapDoc)
pdfDoc = arcpy.mapping.PDFDocumentOpen(path)
print pdfDoc.pageCount

Anonymous User · ‎03-04-2013

The pdfs are created from the one mxd. For example: Abilene has 11 sheets (data driven pages) so when that tool makes the PDFs it makes 11 PDFS. One for each sheet or Data driven page.

Using Data Driven Pages should be creating only one PDF that is 11 pages for that MXD. You need to set the 'multiple_files' parameter to the 'PDF_SINGLE_FILE' option. That way you get all the sheets in one PDF.

http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//00s300000030000000

As for the m, I forgot to add the import statement. As a shortcut I always import arcpy.mapping as m so I only have to type m instead of the full arcpy.mapping namespace. You should try re-exporting your mxds as multi-page documents so that you will have only one PDF with all the sheets; then you can do a page count on each PDF. I added a part at the end that makes a text file inside the PDF folder reporting any that do not match up. But as I said, I would recommend re-exporting your DDP into multi-page documents.

#Purpose: Compares Sheet_ID field to the total PDFs created from Data Driven Pages Program

import arcpy, os, sys
from os import path as p
from arcpy import mapping as m

#Make parameters for people to choose mxd and folder of PDFs to compare
mxdList = arcpy.GetParameterAsText(0).split(';')
pdf_path = arcpy.GetParameterAsText(1)
fc_path = arcpy.GetParameterAsText(2)

report = {}
for mapDoc in mxdList:
    arcpy.mapping.MapDocument(mapDoc)
    for lyr in m.ListLayers(mxd):
        if lyr.supports('DATASOURCE'):
            if lyr.dataSource == fc_path:
                max_list = []
                rows = arcpy.SearchCursor(fc_path, "Sheet_ID")
                for row in rows:
                    max_list.append(row.Sheet_ID)
                pdf_count = max(max_list)
                arcpy.AddMessage('Sheet count:  %s'%pdf_count)
            else:
                arcpy.AddError('No layers in %s match data source'%mapDoc)
                
    # List all PDF's in the pdf_path folder
    arcpy.env.workspace = ws = pdf_path
    for pdf_doc in arcpy.ListFiles('*.pdf'):
        if ''.join([a for a in pdf_doc.split('.')[0] if a.isalpha()]) == mapDoc.replace('_',''):
            pdf = p.join(ws, pdf_doc)
            pdf_object = arcpy.mapping.PDFDocumentOpen(pdf)
            page_count = pdf_object.pageCount
            if page_count == pdf_count:
                arcpy.AddMessage('%s Sheet Count matches number of pages in %s' %(mapDoc,pdf_doc))
            else:
                report[mapDoc] = pdf_doc

# Report maps that do not match up
if len(report) > 0:
    f = open(p.join(ws,'Page_Report.txt'), 'w')
    for k,v in report.iteritems():
        f.write('%s Sheet Count does not match number of pages in %s\n\n' %(k,v))
    f.close()
    arcpy.AddMessage('Created report for map documents and PDF\'s')
else:
    arcpy.AddMessage('All PDF\'s have the correct amount of pages')

Anonymous User · ‎03-04-2013

If your goal is to have all the PDFs as individual pages (for example, if you want 11 individual PDFs for one map book), you could try this code:

#Purpose: Compares Sheet_ID field to the total PDFs created from Data Driven Pages Program

import arcpy, os, sys
from os import path as p
from arcpy import mapping as m

#Make parameters for people to choose mxd and folder of PDFs to compare
mxdList = arcpy.GetParameterAsText(0).split(';')
pdf_path = arcpy.GetParameterAsText(1)
fc_path = arcpy.GetParameterAsText(2)

report = {}
page_list = []
for mapDoc in mxdList:
    arcpy.mapping.MapDocument(mapDoc)
    for lyr in m.ListLayers(mxd):
        if lyr.supports('DATASOURCE'):
            if lyr.dataSource == fc_path:
                max_list = []
                rows = arcpy.SearchCursor(fc_path, "Sheet_ID")
                for row in rows:
                    max_list.append(row.Sheet_ID)
                sheet_count = max(max_list)
                arcpy.AddMessage('Sheet count:  {0}'.format(sheet_count))
            else:
                arcpy.AddError('No layers in %s match data source'%mapDoc)
                
    # List all PDF's in the pdf_path folder
    arcpy.env.workspace = ws = pdf_path
    for pdf_doc in arcpy.ListFiles('%s*.pdf' %mapDoc):
        page_list.append(''.join([i for i in pdf_doc.split('.')[0] if i.isdigit()]))
    page_count = max(page_list)
    if page_count == sheet_count:
        arcpy.AddMessage('%s Sheet Count matches number of pages in %s' %(mapDoc,pdf_doc))
    else:
        report[mapDoc] = pdf_doc

# Report maps that do not match up
if len(report) > 0:
    f = open(p.join(ws,'Page_Report.txt'), 'w')
    for k,v in report.iteritems():
        f.write('%s Sheet Count does not match number of pages in %s\n\n' %(k,v))
    f.close()
    arcpy.AddMessage('Created report for map documents and PDF\'s')
else:
    arcpy.AddMessage('All PDF\'s have the correct amount of pages')
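
One thing to watch in the snippet above: the page numbers pulled out of the file names are strings, and max() on strings compares character by character, so '9' beats '11'. Here is a small sketch of converting to int first. It is plain Python, page_number is a made-up helper, and it assumes the page number is the trailing run of digits in the file name.

```python
import os
import re

def page_number(pdf_name):
    """Return the trailing number in a PDF file name as an int, or None."""
    stem = os.path.splitext(pdf_name)[0]
    found = re.search(r'(\d+)$', stem)
    return int(found.group(1)) if found else None

names = ['Abilene_Base_Map9.pdf', 'Abilene_Base_Map11.pdf']

# Comparing the digits as text picks the wrong "largest" page
as_text = max(str(page_number(n)) for n in names)
# Comparing them as integers gives the real highest page number
as_ints = max(page_number(n) for n in names)
# as_text is '9', as_ints is 11
```

The integer result is also the one that can be compared safely against a numeric Sheet_ID value.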

MichelleCouden1 · ‎03-04-2013

We actually have two scripts for the data driven pages: one that makes one PDF with all 11 pages inside, and another that makes 11 different PDFs. I work for the State, and they have very weird requests at times. You have to cover your A## and be ready for any and all weird requests. Thanks for the correction. I'll work with the code and let you know.

MichelleCouden1 · ‎03-04-2013

I'm getting an "mxd is not defined" error. I'm wondering if it is how I have the parameters set up. I have the user putting in the mxd, then the feature class, then the PDF folder. Should I make it the same order as the code: mxd, pdf, fc?

import arcpy, os, sys
from os import path as p
from arcpy import mapping as m

#Make parameters for people to choose mxd and folder of PDFs to compare
mxdList = arcpy.GetParameterAsText(0).split(';')
pdf_path = arcpy.GetParameterAsText(1)
fc_path = arcpy.GetParameterAsText(2)

report = {}
page_list = []

#Use Search Cursor to go through Attribute Table to get Sheet number info
for mapDoc in mxdList:
    arcpy.mapping.MapDocument(mapDoc)
    for lyr in m.ListLayers(mxd):   # It stops right here with the mxd error.

Anonymous User · ‎03-04-2013

I apologize, I was very sloppy with the code I posted. It was untested and full of errors. I just did a quick test and this should work:

#Purpose: Compares Sheet_ID field to the total PDFs created from Data Driven Pages Program

import arcpy, os, sys
from os import path as p
from arcpy import mapping as m

#Make parameters for people to choose mxd and folder of PDFs to compare
mxdList = arcpy.GetParameterAsText(0).split(';')
pdf_path = arcpy.GetParameterAsText(1)
fc_path = arcpy.GetParameterAsText(2)

# The PDF folder is the workspace for listing files and writing the report
arcpy.env.workspace = ws = pdf_path

report = {}
for mapDoc in mxdList:
    arcpy.AddMessage(mapDoc)
    mxd = m.MapDocument(mapDoc)

    # Find the layer whose data source is the feature class with Sheet_ID
    sheet_count = None
    for lyr in m.ListLayers(mxd):
        if lyr.supports('DATASOURCE') and lyr.dataSource == fc_path:
            max_list = []
            rows = arcpy.SearchCursor(lyr)
            for row in rows:
                max_list.append(row.Sheet_ID)
            sheet_count = max(max_list)
            arcpy.AddMessage('Sheet count:  %s'%sheet_count)
            break
    if sheet_count is None:
        # Only an error if no layer in the whole mxd matched
        arcpy.AddError('No layers in %s match data source'%mapDoc)
        continue

    # Get the page numbers from the PDF's named after this mxd only
    base = p.splitext(p.basename(mapDoc))[0]
    page_list = []
    for pdf_doc in arcpy.ListFiles(base + '*.pdf'):
        digits = p.splitext(pdf_doc)[0][len(base):]
        if digits.isdigit():
            page_list.append(int(digits))
    page_count = max(page_list) if page_list else 0
    if page_count == sheet_count:
        arcpy.AddMessage('%s Sheet Count matches number of pages in %s' %(mapDoc,pdf_path))
    else:
        report[mapDoc] = pdf_path

# Report maps that do not match up
if len(report) > 0:
    # Open in write mode so the report can be created
    f = open(p.join(ws,'Page_Report.txt'), 'w')
    for k,v in report.items():
        f.write('%s Sheet Count does not match number of pages in %s\n\n' %(k,v))
    f.close()
    arcpy.AddMessage('Created report for map documents and PDF\'s')
else:
    arcpy.AddMessage('All PDF\'s have the correct amount of pages')
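
If you also want to know which sheets are missing rather than just whether the totals agree, a set difference is one option. This is only a sketch in plain Python with no arcpy calls: missing_sheets is a hypothetical helper, and it assumes the Sheet_ID values are integers that match the page number at the end of each PDF name.

```python
import os
import re

def missing_sheets(sheet_ids, pdf_names):
    """Return the sorted sheet IDs that have no PDF with a matching trailing number."""
    found = set()
    for name in pdf_names:
        hit = re.search(r'(\d+)$', os.path.splitext(name)[0])
        if hit:
            found.add(int(hit.group(1)))
    return sorted(set(sheet_ids) - found)

# 11 sheets, as in the Abilene example, with the PDF for sheet 7 left out
sheet_ids = range(1, 12)
pdf_names = ['Abilene_Base_Map%d.pdf' % i for i in range(1, 12) if i != 7]
gaps = missing_sheets(sheet_ids, pdf_names)
# gaps == [7]
```

The sheet IDs would come from the search cursor loop and the PDF names from the file listing, so the same lists the script already builds could be passed in.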