How can I use python to convert ppt to pdf?

06-28-2016 02:31 PM
PaulHuffman
Occasional Contributor III

This isn't a GIS question, but I thought maybe someone could help. I have a couple of folders of PowerPoint files that I need to convert to PDF, and I thought scripting the task would be good Python practice. I am using the example at http://odetocode.com/blogs/scott/archive/2013/06/26/convert-a-directory-of-powerpoint-slides-to-pdf-... as a starting point. If I can see it run once, I can extend it to loop through a whole folder and write to a different output folder.

import sys
import os
import glob
import win32com.client

def convert(files, formatType = 32):
    powerpoint = win32com.client.Dispatch("Powerpoint.Application")
    powerpoint.Visible = 1
    for filename in files:
        newname = os.path.splitext(filename)[0] + ".pdf"
        deck = powerpoint.Presentations.Open(filename)
        deck.SaveAs(newname, formatType)
        deck.Close()
    powerpoint.Quit()

files = glob.glob(os.path.join(sys.argv[1],"*.ppt?"))
convert(files)

First I tried running the script in IDLE, where I hit "No module named win32com". After I installed the pywin32 extensions for Python 2.7, 32-bit, I got "ImportError: No module named win32api". Looking through posts on GeoNet, I found that I had this same problem with win32 back in 2006, and the easy way out was to run the script in PythonWin. I'm running ArcGIS 10.3.1 on a 64-bit Windows system, which appears to use Python 2.7.8, so I installed the 32-bit win32 build for 2.7.
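When an import works in one editor but fails in another, the two are often running different Python installations. The following is a small sketch, not part of the original script, that reports which interpreter is running and whether a module can be imported from it. The helper names `can_import` and `describe_interpreter` are hypothetical.

```python
import platform
import sys


def can_import(module_name):
    """Return True if this interpreter can import module_name."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False


def describe_interpreter():
    """Summarize which Python is running and whether it is 32- or 64-bit."""
    return {
        "executable": sys.executable,
        "version": platform.python_version(),
        "bits": platform.architecture()[0],
    }


if __name__ == "__main__":
    print(describe_interpreter())
    for name in ("win32api", "win32com.client"):
        print(name, "importable:", can_import(name))
```

Running this from both IDLE and PythonWin and comparing the "executable" entries should show whether the two are using the same installation.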

Running the script in PythonWin made it easier to supply an input file name as argv[1], and I didn't hit any of the win32 errors. The script seemed to run to completion, showing "returned exit code 0" in the PythonWin status bar, but I can't find any output PDF file on my system. What happened?


Accepted Solutions
DarrenWiens2
MVP Honored Contributor

I'll confirm that this works for me:

import sys
import os
import glob
import win32com.client

def convert(files, formatType = 32):
    powerpoint = win32com.client.Dispatch("Powerpoint.Application")
    powerpoint.Visible = 1
    for filename in files:
        newname = os.path.splitext(filename)[0] + ".pdf"
        deck = powerpoint.Presentations.Open(filename)
        deck.SaveAs(newname, formatType)
        deck.Close()
    powerpoint.Quit()

files = glob.glob(r'PATH_TO_MY_PPTX') # <--- ONLY CHANGE
convert(files)

PyScripter, Python 2.7.5


8 Replies
DanPatterson_Retired
MVP Esteemed Contributor

Can you add a print(files) line after line 16 (the files = glob.glob(...) assignment) and another print after newname is set, to confirm that files are being found and output names created? Or have you done this already?

PaulHuffman
Occasional Contributor III

print(files) after line 16 gave []. I commented that out and tried a print after line 10, and got no output.

This example script doesn't set the input or output path, so I figured it might work if I just saved and ran the script in the same folder that contains the PowerPoint files (.ppt?). Maybe I'll see if I can set the path and then create a list of ppt? files. Oh, never mind: I guess I can't use arcpy to set the workspace or build the list, because PPT isn't a format arcpy handles.
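An empty list from glob usually means either the folder or the pattern is not what the script thinks it is. Here is a minimal sketch of collecting the list from an explicit folder, with a check that fails loudly when the folder is wrong. The function name `find_decks` is hypothetical. It returns absolute paths on the assumption (worth checking on your setup) that a relative path handed to PowerPoint may be resolved against a different working directory than the script's.

```python
import glob
import os


def find_decks(folder, pattern):
    """Return sorted absolute paths of files in folder that match pattern."""
    if not os.path.isdir(folder):
        raise ValueError("not a folder: %r" % (folder,))
    # Joining with the absolute folder makes every result an absolute path.
    matches = glob.glob(os.path.join(os.path.abspath(folder), pattern))
    return sorted(matches)
```

Printing len(find_decks(folder, pattern)) before calling convert shows right away whether any input was found.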

PaulHuffman
Occasional Contributor III

Thanks, Darren, that's a lot like what I came up with on my own. I set the path explicitly, then got sidetracked with a for loop to test the list contents (lines 18-23, now commented out). When that showed me a good list of files, I almost put lines 25-27 inside the for loop, but then thought, hey, that's what the glob.glob part is supposed to do. (Still, I don't entirely understand how this magic works. I'll read up on it.) So I replaced sys.argv[1] with my dir variable in line 25, and it ran. PowerPoint opens, shows a little "publishing" progress bar for each input file, then shuts down. Your suggestion to change ppt? to ppt* worked well to pick up both the old and the new file extensions.

Now if I can just direct the output files to a different folder, I'll be done.

import sys
import os
import glob
import win32com.client

def convert(files, formatType = 32):
    powerpoint = win32com.client.Dispatch("Powerpoint.Application")
    powerpoint.Visible = 1
    for filename in files:
        newname = os.path.splitext(filename)[0] + ".pdf"
        #print(files)
        deck = powerpoint.Presentations.Open(filename)
        deck.SaveAs(newname, formatType)
        deck.Close()
    powerpoint.Quit()

dir =  r"C:\Users\Paul\Documents\web\ykfp\Par16\Presentations\YBSMC 2016 Thursday"
##filelist = []
##for file in [ ppt for ppt in os.listdir(dir)
##if ppt.endswith("ppt")]:
##    filelist.append(file)
##
#print filelist

files = glob.glob(os.path.join(dir,"*.ppt*"))
print(files)
convert(files)

DarrenWiens2
MVP Honored Contributor

Oh, I think your problem is that '*.ppt?' doesn't match any 'ppt' files. '?' means one character, '*' means zero or more characters, so you could change to '*.ppt*' to pick up ppt and pptx.
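The difference between the two wildcards can be checked without touching the file system. This is a small sketch using the standard fnmatch module, whose wildcard rules glob also follows. The file names are made up for illustration.

```python
import fnmatch

# Hypothetical file names, for illustration only.
names = ["talk.ppt", "talk.pptx", "talk.pptm", "notes.txt"]


def matching(pattern):
    """Return the names that match a shell-style wildcard pattern."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]


print(matching("*.ppt?"))  # ['talk.pptx', 'talk.pptm']
print(matching("*.ppt*"))  # ['talk.ppt', 'talk.pptx', 'talk.pptm']
```

Note that '*.ppt*' is broad: it would also match a name such as deck.ppt.bak, so a folder with stray files may need a stricter filter.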

PaulHuffman
Occasional Contributor III

I got it to work satisfactorily. Now the PDF files are written to a different folder, outputdir. I tried a couple of times to combine lines 13-15 (the three newname assignments) into one line, but I don't have time for that kind of elegance. By default, SaveAs overwrites files in the output directory if they already exist, which is fine by me. I suppose if you wanted to avoid overwriting, you would have to use try-except somehow.

import sys
import os
import glob
import win32com.client

dir =  r"C:\Users\Paul\Documents\web\ykfp\Par16\Presentations\YBSMC 2016 Thursday"
outputdir = r"C:\Users\Paul\Documents\web\ykfp\Par16\Presentations\pdf"

def convert(files, formatType = 32):  # 32 is PowerPoint's ppSaveAsPDF constant
    powerpoint = win32com.client.Dispatch("Powerpoint.Application")
    powerpoint.Visible = 1
    for filename in files:
        newname = os.path.splitext(filename)[0] + ".pdf"  # swap the extension
        newname = os.path.split(newname)[1]               # keep only the file name
        newname = os.path.join(outputdir,newname)         # place it in outputdir
        deck = powerpoint.Presentations.Open(filename)
        deck.SaveAs(newname, formatType)
        deck.Close()
    powerpoint.Quit()

files = glob.glob(os.path.join(dir,"*.ppt*"))
#print(files)
convert(files)
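Rather than try-except, one way to avoid overwriting is to check whether the target PDF already exists before saving. This is a sketch under that assumption, kept separate from PowerPoint so it can be tested on its own. The name `plan_conversions` and its `overwrite` flag are hypothetical.

```python
import os


def plan_conversions(files, outputdir, overwrite=False):
    """Pair each input file with its PDF path.

    Inputs whose PDF already exists are skipped unless overwrite is True.
    """
    plan = []
    for filename in files:
        stem = os.path.splitext(os.path.basename(filename))[0]
        target = os.path.join(outputdir, stem + ".pdf")
        if overwrite or not os.path.exists(target):
            plan.append((filename, target))
    return plan
```

The loop in convert could then iterate over the (filename, target) pairs from plan_conversions(files, outputdir) and save each deck to its target.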

DarrenWiens2
MVP Honored Contributor

I wouldn't worry about the one-liner for something like that. It doesn't exactly help readability:

newname = os.path.join(outputdir,os.path.split(os.path.splitext(filename)[0] + ".pdf")[1]) # untested
PaulHuffman
Occasional Contributor III

Does it lack style to use newname three times and to use newname to define a new version of newname? 
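Reassigning one name works, but giving each intermediate step its own name is one possible alternative that keeps the steps readable without a long one-liner. This is a sketch only. The function name `pdf_path_for` is hypothetical and the example values in the comments are illustrative.

```python
import os


def pdf_path_for(filename, outputdir):
    """Return the path of the PDF in outputdir for the given input file."""
    base = os.path.basename(filename)   # e.g. "talk.pptx"
    stem = os.path.splitext(base)[0]    # e.g. "talk"
    return os.path.join(outputdir, stem + ".pdf")
```

Inside the loop this would replace the three assignments with newname = pdf_path_for(filename, outputdir).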
