Automatically add PDF metadata for accessibility after export with pikePDF

Teresa_Blader · ‎09-22-2025

PikePDF is a conda-forge library that would let me keep the file paths, file names, titles, subjects, authors, and keywords in a csv file and write that metadata onto my exported PDF maps. I have about 70 maps that I export regularly to different locations and need to make WCAG 2.1 compliant. Since there is currently no way to do this export in arcpy and also include the accessibility tags, I will have to export them one at a time. But if I can get pikePDF to work, that at least keeps me from having to type the metadata over and over again.

Has anyone successfully cloned their arcgis python environment and installed pikePDF?

I'm currently at 3.3.4, and when I tried this, my ArcGIS Pro environment crashed. I'm not sure why. Maybe it's not compatible?

As a note, I couldn't clone my library until our network team allowed conda through the firewall.

https://anaconda.org/conda-forge/pikepdf

Teresa Blader
Olmsted County GIS
GIS Analyst - GIS Solutions

BrennanSmith1 · ‎09-29-2025

It doesn't directly answer your question, but see this thread for an example of using pypdf to write metadata to a pdf after export from Pro. You might be able to accomplish your task without having to install pikePDF.

Teresa_Blader · ‎09-29-2025

I did read that earlier, but it sounds like I wouldn't be able to reference a csv with all the metadata in it, and I'd have to write it into the Python script instead? It looks like the title, author, subject, and keywords are written right into the script, yes? And that example is for a map series, whereas I'm not working with a map series in this case.

I'm super new to Python, so that script looked pretty intimidating haha

BrennanSmith1 · ‎09-30-2025

You would load the CSV as a dataframe, then loop through it to get the values you need to modify the PDFs. Below is a more direct example for your workflow. I haven't tested it directly, but it should work; you just need to tweak the column names to match your csv.

import pandas as pd
from pypdf import PdfWriter, PdfReader

## Make sure your csv has simple column names without spaces
## This will make it easier to access them from a named tuple later
## In this example, I am assuming csv columns named:
    # filepath
    # title
    # author
    # subject
    # keywords

#define your csv and load as dataframe
csv_file = r"path/to/file.csv"
df = pd.read_csv(csv_file)

#iterate over the rows
for row in df.itertuples():
    # you can now access values using row.columnname

    # open pdf
    reader = PdfReader(row.filepath)
    writer = PdfWriter(clone_from=reader)
    
    #write metadata
    writer.add_metadata({"/Title": row.title,
                         "/Author": row.author,
                         "/Subject": row.subject,
                         "/Keywords": row.keywords})
    
    #save pdf
    with open(row.filepath, "wb") as f:
        writer.write(f)
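
One thing to watch for with the loop above: if some cells in the csv are blank, pandas reads them as NaN, which you probably don't want written into the PDF. Below is a rough sketch of one way to build the metadata dictionary while skipping blank cells, using Python's built-in csv module instead of pandas. The names COLUMN_TO_KEY, build_metadata, and read_rows are made up for illustration, the column names are the same assumed ones as above, and the sample rows are invented stand-ins for a real csv.

```python
import csv
import io

# Hypothetical mapping from the assumed csv column names to PDF info keys
COLUMN_TO_KEY = {
    "title": "/Title",
    "author": "/Author",
    "subject": "/Subject",
    "keywords": "/Keywords",
}


def build_metadata(row):
    """Return a dict of PDF info keys for one csv row, skipping blank cells."""
    metadata = {}
    for column, key in COLUMN_TO_KEY.items():
        value = (row.get(column) or "").strip()
        if value:
            metadata[key] = value
    return metadata


def read_rows(csv_source):
    """Yield (filepath, metadata) pairs from an open csv file object."""
    for row in csv.DictReader(csv_source):
        filepath = (row.get("filepath") or "").strip()
        if not filepath:
            continue  # skip rows that have no PDF path
        yield filepath, build_metadata(row)


# Invented in-memory sample standing in for the real csv file
sample = io.StringIO(
    'filepath,title,author,subject,keywords\n'
    'maps/parks.pdf,Park Map,GIS Team,Parks,"parks, trails"\n'
    'maps/roads.pdf,Road Map,,Roads,\n'
)

rows = list(read_rows(sample))
for filepath, metadata in rows:
    print(filepath, metadata)
```

With a real file you would pass open(csv_file, newline="") to read_rows in place of the sample, and each metadata dictionary can go straight into writer.add_metadata(metadata) inside the loop.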