Python Code for Map Series PDF export with multiple layouts

Amp_Dev · ‎04-09-2024

Hi everyone!

I want to preface this by saying I am VERY new to coding so I could be doing this totally wrong. Also, please let me know if I'm posting this question in the wrong board.

In ArcGIS Pro, I wanted to be able to export a single PDF file consisting of Spatial Map Series Pages. However, I needed it to export two different layouts, alternating between them. For example, it would look like: FirstLayout Page 1, SecondLayout Page 1, FirstLayout Page 2, SecondLayout Page 2, FirstLayout Page 3, SecondLayout Page 3, and so on.

The code I wrote works, but I have a few questions because I want to simplify it and somehow create a tool out of it so I don't have to input the code every time.

Here's the code with questions:

# coding: utf-8
# This code groups FirstLayout with SecondLayout and exports in one PDF file.

#FIRST QUESTION: do all of these need to be imported for this specific task? Does it even matter? I am still confused about the whole third-party module concept.

import arcpy
import os
import sys 

# Set the output PDF file path
output_pdf = r'C:\Users\BlahBlah\FilePath\Code Testing\Test.pdf'

#SECOND QUESTION: Is there a way to have the code auto fill layout names?
# Currently, I have to replace the layout names every time I change them.

# List of layout names
layout_names = ['FirstLayout', 'SecondLayout']

# Reference the active ArcGIS Pro Project
aprx = arcpy.mp.ArcGISProject("CURRENT")

# Create the PDF document
pdf_document = arcpy.mp.PDFDocumentCreate(output_pdf)

#THIRD QUESTION: Here is where I get messed up. I should be able to reference the map series below, under "# Iterate through the unique IDs", BUT if I don't reference the map series before that step, I get this error message:
# "<string>", line 1, in <module>
#   File "C:\Program Files\ArcGIS\Pro\Resources\ArcPy\arcpy\arcobjects\_base.py", line 90, in _get
#     return convertArcObjectToPythonObject(getattr(self._arc_object, attr_name))
# RuntimeError: Unexpected error." -- Why??

#Reference the map series
for layout_name in layout_names:
        layout = aprx.listLayouts(layout_name)[0]
        map_series = layout.mapSeries

# Iterate through the unique IDs
for unique_id in range(1, map_series.pageCount + 1):
    for layout_name in layout_names:
        layout = aprx.listLayouts(layout_name)[0]
        map_series = layout.mapSeries
        map_series.currentPageNumber = unique_id
        pageName = map_series.pageRow.UNI_IDS__LO_  # UniqueID is the attribute field name used to define the map series. Change accordingly if a different field name is used.
        temp_pdf = r'C:\Users\BlahBlah\FilePath\Code Testing\TempLayout.pdf'
        layout.exportToPDF(temp_pdf)
        pdf_document.appendPages(temp_pdf)

#LAST QUESTION: Does this always have to be loaded after the code is done?

# After the software finishes the PDF, enter this.
pdf_document.saveAndClose()

Amp_Dev · ‎04-09-2024

Totally forgot! I also want to auto delete the templayout.

MErikReedAugusta · ‎04-09-2024

Quick answer, you can use either the os or pathlib module:

os.remove(temp_pdf)

General consensus is that pathlib is designed around path-like objects and is the better library to use for stuff like this. The code below assumes temp_pdf is a Path() object, instead of a regular string like you have in your code. My longer reply below is built around pathlib, if you want to see the difference.

temp_pdf.unlink()

I think pathlib is unavailable in Python 2, though, which means no pathlib in ArcGIS Desktop, if you're still using it. If you are, you should definitely consider upgrading to Pro anyway.

Also, for a more comprehensive answer comparing the two, I'm just going to link to this StackOverflow page: https://stackoverflow.com/questions/6996603/how-can-i-delete-a-file-or-folder-in-python
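
For anyone who wants to see the two options side by side, here is a minimal sketch that runs outside ArcGIS. The throwaway folder and the file names are stand-ins made up for the example, not the paths from the original script.

```python
import os
import pathlib
import tempfile

# Work in a throwaway folder so nothing real gets deleted.
work_dir = pathlib.Path(tempfile.mkdtemp())

# Option 1: os.remove() with a plain string path.
temp_pdf_str = os.path.join(str(work_dir), "TempLayout.pdf")
with open(temp_pdf_str, "wb") as f:
    f.write(b"placeholder")
os.remove(temp_pdf_str)

# Option 2: Path.unlink() on a Path object.
temp_pdf_path = work_dir / "TempLayout2.pdf"
temp_pdf_path.write_bytes(b"placeholder")
temp_pdf_path.unlink()

# Both files are gone now; remove the empty folder as well.
remaining = list(work_dir.iterdir())
work_dir.rmdir()
```

Either call raises FileNotFoundError if the file is already gone, so in a loop it can be worth checking that the file exists first.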

------------------------------
M Reed
"The pessimist may be right oftener than the optimist, but the optimist has more fun, and neither can stop the march of events anyhow." — Lazarus Long, in Time Enough for Love, by Robert A. Heinlein

Amp_Dev · ‎04-09-2024

@jcarlson @JohannesLindner Hope it's okay to tag you guys! I've looked at a lot of your posts and you two seem to be pretty knowledgeable, so I wonder if either of you has any feedback! Please and thank you in advance!

jcarlson · ‎04-09-2024

Oof. It's been a minute since I worked with layouts through ArcPy. Creating a complex map series, or anything with multiple pages in it, can be an absolute pain in the neck.

Personally? I just use QGIS for those kinds of things. I still don't really know the arcpy.mp module well enough to give you good guidance here, other than just avoid it unless you absolutely have to use it.

- Josh Carlson
Kendall County GIS

MErikReedAugusta · ‎04-09-2024

Line 4/Question 1:

The simple answer is, "If you didn't use it in your code, then it doesn't need to be up here." I highly recommend grabbing an IDE like PyCharm or Spyder (not IDLE), if you haven't already. They often have tools that will help you keep track of whether or not a library is actually in use by that code, and some will even insert the import statement for you if you call something you haven't imported, yet. I prefer PyCharm, but that's a highly subjective determination.
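
If you're curious what that kind of check looks like under the hood, here is a rough sketch using only the standard library. The function name unused_imports is made up for this example, and a real IDE does a much more thorough job; this only looks for imported names that never show up again in the code.

```python
import ast

def unused_imports(source):
    """Return imported names that are never referenced in the source text."""
    tree = ast.parse(source)  # parses the text only; nothing gets imported or run
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the name "a"; "import a as x" binds "x"
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)

# A trimmed-down copy of the imports from the original post.
snippet = '''
import arcpy
import os
import sys

aprx = arcpy.mp.ArcGISProject("CURRENT")
'''
print(unused_imports(snippet))  # ['os', 'sys']
```

Run against the imports in your script, it flags os and sys, which matches the simple answer: nothing in the posted code calls them.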

I'll come back to Questions 2-4 in a bit.

First, let's look at the general structure of your script:

1. Import some stuff.
2. Create a list of 2 items: "layout_names".
3. Access the currently-open APRX file.
4. Create a blank PDF file.
   - Line 25/Question 3: The reason you have to do this first is that PDFDocumentCreate appears to treat the PDF as essentially a list of pages. You can't append things to a list that doesn't exist, so you have to create an empty document first, and then you can append pages to it.
5. Loop through layout_names.
   - Line 13/Question 2: I'm actually going to address this one here.
   - aprx.listLayouts(layout_name) will return a list of all Layouts that match the search string you put inside the parentheses. It's an optional argument, too.
   - Instead, you could theoretically just look for all layouts. If the only layout files in your APRX are FirstLayout and SecondLayout, then it'll automatically pull them, regardless of name.
   - The tricky part is going to be if you have more layouts and/or if your naming conventions don't sort the way you want them to.
     - Having layout files "Option1" and "Layout2" would actually make "Layout2" the FirstLayout, since it comes alphabetically first, for example.
     - Having layout files "Layout1", "Layout2", "Layout3" would produce "Layout1-Page1, Layout2-Page1, Layout3-Page1, Layout1-Page2, [...]". It's up to you whether or not this is desirable.
   - Lines 31-33 are almost entirely redundant with Lines 38-40. They're going to be overwritten by the later lines and can be completely ignored & removed.
     - The only thing this functionally does is give you an iterable object that you can use in Line 37 to tell you how many pages to expect, but it's always going to be the Map Series for "SecondLayout", since the last iteration of this loop overwrites all previous ones.
     - It's worth noting that if SecondLayout has fewer pages than FirstLayout, you'll be missing the last pages of FirstLayout, because SecondLayout is entirely driving the bus on the loop.
6. Loop through each expected page and for each layout in layout_names (in that order), do the following:
   - List all layouts that match the search string from layout_names, and take the first result as "layout"
   - Read the map series in layout, and retrieve the current expected page number
   - Save that page as a single-page temporary PDF
   - Append that temporary PDF page to your PDF document that you created up in Step 4
   - Repeat

There appears to be a lot of redundancy & inefficiency in this script, to me. For one, that last loop: For every page of SecondLayout, you pull the whole map series, navigate to the matching page, and do some stuff. I don't know what the memory hit is for pulling the whole map series, but this sort of process in general is unideal.
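
To make the ordering and page-count points from that walkthrough concrete, here is a small sketch in plain Python with no arcpy involved. The function name export_order and the page counts are made up for illustration, and the alphabetical sort is an assumption standing in for whatever order listLayouts() actually hands back.

```python
def export_order(page_counts):
    """Return (layout_name, page_number) pairs in the order they would be exported.

    page_counts maps a layout name to the number of pages in its map series.
    """
    names = sorted(page_counts)            # assumption: layouts come back sorted by name
    last_page = max(page_counts.values())  # let the longest series drive the outer loop
    order = []
    for page in range(1, last_page + 1):
        for name in names:
            if page <= page_counts[name]:  # skip layouts that have run out of pages
                order.append((name, page))
    return order

print(export_order({"Option1": 3, "Layout2": 2}))
# [('Layout2', 1), ('Option1', 1), ('Layout2', 2), ('Option1', 2), ('Option1', 3)]
```

"Layout2" sorts ahead of "Option1", and because the longest series drives the loop, the third page of "Option1" still makes it into the output even though "Layout2" stops at two pages.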

First, here's the rough structure for how I would do this. I'm going to include actual Python code below, but if you want to try to dissect & understand this, see if you can write it, first, and then compare against how I did it.

1. Access the current APRX file.
2. Pull a list of all Layout Files in that APRX and save it to layout_files.
   - Make sure you're saving the actual layout objects you get back from .listLayouts(); it'll save you from having to pull them up again later.
   - If there's a consistent search string in the name that you can look for, it's worth putting it into the command. Otherwise, just pull them all and cross your fingers that you named them in a sortable way.
   - If this doesn't "just work", then you'll always need to provide the Layout Names, whether as a hardcoded parameter like this, or as an Input to a Tool. Either way, layout_files should always have layout objects inside it.
3. Create your blank PDF file for later.
4. Create a master_page_count variable and set it to 0.
5. Compare the page counts of all Layout Files from Step 2 against master_page_count, and save the highest page count you find to that variable.
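
The page-count comparison in the last two steps can be tried out without opening ArcGIS at all. The sketch below uses SimpleNamespace objects as stand-ins for layouts; they only mimic the mapSeries.pageCount attribute and are not real arcpy objects. The check for None is based on my understanding that a layout with no map series enabled returns None for mapSeries, so treat that part as an assumption.

```python
from types import SimpleNamespace

def highest_page_count(layouts):
    """Return the largest mapSeries.pageCount found among the given layouts."""
    master_page_count = 0
    for layout in layouts:
        series = layout.mapSeries
        if series is None:  # assumption: no map series enabled on this layout
            continue
        if series.pageCount > master_page_count:
            master_page_count = series.pageCount
    return master_page_count

# Stand-in objects, NOT real arcpy layouts.
first = SimpleNamespace(name="FirstLayout", mapSeries=SimpleNamespace(pageCount=12))
second = SimpleNamespace(name="SecondLayout", mapSeries=SimpleNamespace(pageCount=10))
title = SimpleNamespace(name="TitlePage", mapSeries=None)

print(highest_page_count([first, second, title]))  # 12
```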

Quick Notes:
Line 1

If you're pasting it into the Python Window in ArcGIS, then you don't actually need to put "import arcpy". I usually still do, though, when I'm writing it outside in my IDE, so that my IDE can actually see it. If you're creating a Script Tool in ArcGIS, I genuinely can't remember if you need arcpy or not; I think you do. I always include it either way, again because of the IDE, and because it's generally good practice to manually include whatever you're calling, anyway.

Line 2

pathlib is a library designed specifically for working with paths. Explaining it is a bit beyond the scope of this post, but I'm trying to switch myself over to using this, instead of the old ways of doing things with os, glob, and the like.

Line 9

I'm just going to call my final PDF "Final_Output.pdf" for simplicity's sake. There are a bunch of options for how you can handle this. For my workflows, it's usually simplest just to provide where I want it to save, let the tool name the file(s), and then rename things as needed when it's all done.

Also, I've gone ahead and taken the string that the tool input gave me and converted it to a Path object to save me a step down the line.

Lines 14–27

The goal here is to create one master list of Layout objects. First come all layouts that match searchString1, then all layouts that match searchString2. If something somehow matches both search strings, then it only gets added once.

Lines 23–27 are a failsafe. If we haven't found any Layout objects by this point, then we have a problem; nothing matched. If the user provided search strings (Line 24), then there were no matches; let them know with an Error message. If the user didn't provide search strings, then just export every single Layout file in the project.
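
If you want to experiment with that collect-then-failsafe logic away from ArcGIS, the sketch below does the same thing on plain strings. The function name pick_layouts is made up, and fnmatchcase is only a rough stand-in for the wildcard matching that listLayouts() does; the real method may treat case and wildcards differently.

```python
from fnmatch import fnmatchcase

def pick_layouts(all_names, search_strings):
    """Collect names matching each search string in order, without duplicates."""
    wanted = [s for s in search_strings if s != ""]
    picked = []
    for pattern in wanted:
        for name in all_names:
            if fnmatchcase(name, pattern) and name not in picked:
                picked.append(name)
    if not picked:
        if wanted:
            # Search strings were given, but nothing matched them.
            raise ValueError("Provided search strings returned no results.")
        # No search strings at all: fall back to every layout.
        picked = list(all_names)
    return picked

names = ["FirstLayout", "SecondLayout", "Overview"]
print(pick_layouts(names, ["First*", "*Layout"]))  # ['FirstLayout', 'SecondLayout']
print(pick_layouts(names, ["", ""]))               # ['FirstLayout', 'SecondLayout', 'Overview']
```

"FirstLayout" matches both patterns but is only added once, and leaving both search strings empty falls through to the full list.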

Lines 35–36

Explaining the pathlib module is beyond the scope of this post, as stated above. The short version here: I create a Path object that points to a file called "Final_Output.pdf" in the directory provided in Line 9.
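
For a quick feel of what the slash is doing there, here is a tiny sketch. The folder is a made-up example, and I'm using PureWindowsPath only so the output looks the same on any machine; in the script itself a regular pathlib.Path is what you'd want.

```python
from pathlib import PureWindowsPath

# Hypothetical output folder, standing in for the tool parameter.
pdf_folder = PureWindowsPath(r"C:\Maps\Exports")

# The "/" operator joins path pieces; no manual backslashes needed.
final_pdf = pdf_folder / "Final_Output.pdf"

print(final_pdf)         # C:\Maps\Exports\Final_Output.pdf
print(final_pdf.name)    # Final_Output.pdf
print(final_pdf.parent)  # C:\Maps\Exports

# str() turns the path back into plain text for functions that expect a string.
as_text = str(final_pdf)
```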

Line 45

This should be the correct syntax to delete the temporary PDF file. Full disclosure: I'm still getting a feel for pathlib and switching my code over to using it.

import arcpy
import pathlib

### TOOL IMPORTS ############################################################
#   If you're not making this a tool, replace arcpy.GetParameterAsText(#)
#   with the actual hardcoded value.
searchString1 = arcpy.GetParameterAsText(0)
searchString2 = arcpy.GetParameterAsText(1)
pdf_folder    = pathlib.Path(arcpy.GetParameterAsText(2))  # third tool parameter; indexes start at 0


### LAYOUT FILES, PAGE COUNT ################################################
aprx = arcpy.mp.ArcGISProject('CURRENT')
layout_files = []
if searchString1 != '':
  for layout in aprx.listLayouts(searchString1):
    if layout not in layout_files:
      layout_files.append(layout)
if searchString2 != '':
  for layout in aprx.listLayouts(searchString2):
    if layout not in layout_files:
      layout_files.append(layout)
if len(layout_files) <= 0:
  if searchString1 != '' or searchString2 != '':
    raise ValueError('Provided search strings returned no results.  Unable to export files')
  else:
    layout_files = aprx.listLayouts()

expected_page_count = 0
for layout in layout_files:
  if layout.mapSeries.pageCount + 1 > expected_page_count:
    expected_page_count = layout.mapSeries.pageCount + 1

### PREPARE PDF #############################################################
pdfPath = pdf_folder / 'Final_Output.pdf'
pdfDoc = arcpy.mp.PDFDocumentCreate(str(pdfPath))  # hand arcpy a plain string path

### EXPORT! #################################################################
for pageNumber in range(1, expected_page_count):
  for layout in layout_files:
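    layout.mapSeries.currentPageNumber = pageNumber  # the page number lives on the MapSeries, not the Layout
    tempPDF = pdf_folder / 'TEMP.pdf'
    layout.exportToPDF(str(tempPDF))  # hand arcpy plain string paths
    pdfDoc.appendPages(str(tempPDF))
    tempPDF.unlink()  # delete the temporary single-page PDF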
pdfDoc.saveAndClose()
