Script for Making Individual Zip file for Each Shapefile?

05-29-2013 05:47 AM
JacobCoble
New Contributor
I posted this in the Geoprocessing forum and then I thought I should post it here - sorry if this is redundant.

I need to take all of the shapefiles in a directory and compress each one into its own zip file. For example, all of the files named roads.* should go into roads.zip, all of the files named schools.* into schools.zip, and so on, for hundreds of shapefiles.
16 Replies
JacobCoble
New Contributor
Sorry to keep bugging the forum about this. I get the error "NameError: global name 'zipflie' is not defined". I am running this in ArcGIS 9.3.1. Unfortunately, our shop is still using 9.3.1 on some machines and 10.0 on others, so I have to support some scripts for both.

import sys, os, string, arcgisscripting, fnmatch
from os import path as p
import zipfile

# Create the Geoprocessor object
gp = arcgisscripting.create()

# arcpy.overwriteOutput = True
gp.overwriteoutput=1

def ZipShapes(path, out_path):
    gp.workspace = path
    shapes = gp.ListFeatureClasses("*")

    # iterate through list of shapefiles
    #for shape in shapes:
    for shape in iter(shapes.next, None):
        name = p.splitext(shape)[0]
        print name
        zip_path = p.join(out_path, name + '.zip')
        zip = zipfile.ZipFile(zip_path, 'w', zipflie.ZIP_DELFLATED)
        for path, dirs, files in os.walk(path):
            for f in files:
                if fnmatch.fnmatch(f, '%s*' %shape):
                    zip.write(p.join(path,f), f)
        print 'All files written to %s' %zip_path
        zip.close()

if __name__ == '__main__':

    path = r'T:\\cotiss\\CobleJ\\shape2zip\\address'
    outpath = r'T:\\cotiss\\CobleJ\\shape2zip\\Shape_outputs'

    ZipShapes(path, outpath)
by Anonymous User
Not applicable
I get the error, "NameError: global name 'zipflie' is not defined".


Apparently you mistyped the module name as 'zipflie' when it was supposed to be zipfile. The constant is misspelled too: it should be ZIP_DEFLATED, not ZIP_DELFLATED. Fix this line:

zip = zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED)
JacobCoble
New Contributor
Well, it's very close! Just one problem: it only makes a complete zip file for the first shapefile in my source directory.

Suppose my source folder (the one called "address") has four shapefiles named parcels.*, parcel_join.*, pointaddr.* and zipcodes.*. In my output folder "Shape_outputs" there will be four zip files. The first one, parcels.zip, is okay: it has all of the files making up the parcels shapefile. The next two zip files are empty. The last one, zipcodes.zip, contains only one file from the zipcodes shapefile: it has zipcodes.shp but not zipcodes.dbf, .sbn, .sbx, etc. Judging from the messages it prints, the script takes a few seconds to make the first zip file, then in a flash it prints the names of the other three and stops without an error message.

import sys, os, string, arcgisscripting, fnmatch
from os import path as p
import zipfile

# Create the Geoprocessor object
gp = arcgisscripting.create()

# arcpy.overwriteOutput = True
gp.overwriteoutput=1

def ZipShapes(path, out_path):
    gp.workspace = path
    shapes = gp.ListFeatureClasses("*")

    # iterate through list of shapefiles
    #for shape in shapes:
    for shape in iter(shapes.next, None):
        name = p.splitext(shape)[0]
        print name
        zip_path = p.join(out_path, name + '.zip')
        zip = zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED)
        for path, dirs, files in os.walk(path):
            for f in files:
                if fnmatch.fnmatch(f, '%s*' %shape):
                    zip.write(p.join(path,f), f)
        print 'All files written to %s' %zip_path
        zip.close()

if __name__ == '__main__':

    path = r'T:\\cotiss\CobleJ\shape2zip\address'
    outpath = r'T:\\cotiss\CobleJ\shape2zip\Shape_outputs'

    ZipShapes(path, outpath)
by Anonymous User
Not applicable
Okay, I think I got it now. I cannot test on 9.3 but this did work on 10.1 with arcgisscripting. I went with the glob method instead and it got all the files:

import arcgisscripting, os, glob, zipfile
from os import path as p

# Create the Geoprocessor object
gp = arcgisscripting.create()

# arcpy.overwriteOutput = True
gp.overwriteoutput=1

def ZipShapes(path, out_path):
    gp.workspace = path
    shapes = gp.ListFeatureClasses("*")

    # iterate through list of shapefiles
    for shape in iter(shapes.next, None):
        name = p.splitext(shape)[0]
        print name
        zip_path = p.join(out_path, name + '.zip')
        zip = zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED)
        os.chdir(path)
        for files in glob.glob('%s*' %name):
            zip.write(p.join(path,files), files)
        print 'All files written to %s' %zip_path
        zip.close()

if __name__ == '__main__':

    path = r'T:\\cotiss\CobleJ\shape2zip\address'
    outpath = r'T:\\cotiss\CobleJ\shape2zip\Shape_outputs'

    ZipShapes(path, outpath)
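
A possible reason the earlier os.walk version stopped working after the first shapefile (a guess, not confirmed in this thread): its inner loop reuses the name path as the loop variable, so once the first walk finishes, path no longer has to point at the source folder. The sketch below reproduces only that effect in a throwaway temp folder. The folder and file names are made up for illustration, and it is written for Python 3 rather than the Python 2.x that ships with 9.3.1.

```python
import os
import tempfile

def last_walked_dir(path):
    """Return what `path` is bound to after `for path, ... in os.walk(path)`."""
    for path, dirs, files in os.walk(path):
        pass  # the loop variable rebinds `path` for every directory visited
    return path

# Hypothetical layout: a source folder with one shapefile part and one subfolder.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "info"))
open(os.path.join(root, "parcels.shp"), "w").close()

after = last_walked_dir(root)
print(after == root)  # False: `path` now names the subfolder, not the source folder
```

The glob version above avoids this because nothing in its loop reassigns path.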
JacobCoble
New Contributor
Caleb, thanks for your help! There is just one more quirk. If I have shapefiles with similar names, it makes one zip file per shapefile, but each zip file contains all of the shapefiles with similar names instead of just the one with the exact name. I have two shapefiles, called "parcel" and "ParcelAd", and it makes two zip files, parcel.zip and ParcelAd.zip. But if I open parcel.zip it has everything for both parcel.* and ParcelAd.*, and if I open ParcelAd.zip it also has everything for both parcel.* and ParcelAd.*
T__WayneWhitley
Frequent Contributor
...a minor change to this line should, I think, fix your problem:
for files in glob.glob('%s.*' %name):


You almost answered your own question about this minor but annoying quirk when you said "...it has everything for both parcel.*...": notice you inserted the period. You have to do the same in the code. If you leave off the period, the pattern not only 'globs' the component files making up your shapefile, it also matches any other shapefile whose name begins with the same root (root + file ext, or root + suffix + file ext), so both sets of component files are returned. In other words, the wildcard allows it. So to 'tighten' the wildcard filter you need the period. Also, careful with the xml component file...
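
To see the difference the period makes without touching any real data, here is a small sketch using fnmatch, which applies the same wildcard rules as glob but to a plain list of names. The file list is hypothetical, and the names are all lowercase so that case handling (which differs between Windows and other systems) does not affect the result.

```python
import fnmatch

# Hypothetical folder listing: two shapefiles whose names share a prefix.
files = ["parcel.shp", "parcel.dbf", "parcel.shx", "parcel.shp.xml",
         "parceladd.shp", "parceladd.dbf", "parceladd.shx"]

name = "parcel"

loose = fnmatch.filter(files, "%s*" % name)   # no period: plain prefix match
tight = fnmatch.filter(files, "%s.*" % name)  # period: exact root name only

print(len(loose))  # 7, the parts of both shapefiles are picked up
print(tight)       # ['parcel.shp', 'parcel.dbf', 'parcel.shx', 'parcel.shp.xml']
```

Note that the tight pattern still picks up parcel.shp.xml, since everything after the first period is covered by the wildcard.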

Powerful subtlety... I've stumbled on this as well.

Enjoy,
Wayne
JacobCoble
New Contributor
Success! Thanks so much Caleb, Jason and Wayne. I ran both the 10.0 and the 9.3.1 scripts and everything is fine - they no longer get "confused" by files with similar names, and it makes sense.

So on the 9.3.1 script the line should be
        for files in glob.glob('%s.*' %name):
and in the 10.0 script the equivalent line should be
        for f in arcpy.ListFiles('%s.*' %name):

Thanks!
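
For anyone landing here without ArcGIS installed, the same idea can be sketched with the standard library alone. This is not the script from the thread: it assumes Python 3, finds shapefiles by looking for *.shp instead of calling ListFeatureClasses, and the function name zip_shapefiles is made up for illustration.

```python
import glob
import os
import zipfile

def zip_shapefiles(src_dir, out_dir):
    """Write one <name>.zip into out_dir for every <name>.shp found in src_dir."""
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    written = []
    for shp in sorted(glob.glob(os.path.join(src_dir, "*.shp"))):
        name = os.path.splitext(os.path.basename(shp))[0]
        zip_path = os.path.join(out_dir, name + ".zip")
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
            # The period after the name keeps parcel.* from also matching ParcelAd.*
            pattern = os.path.join(src_dir, glob.escape(name) + ".*")
            for part in sorted(glob.glob(pattern)):
                # Store each component file under its bare file name.
                zf.write(part, os.path.basename(part))
        written.append(zip_path)
    return written

# Example call with hypothetical folders:
# zip_shapefiles(r"C:\data\address", r"C:\data\Shape_outputs")
```

Keeping out_dir separate from src_dir matters here, because a parcel.zip sitting in the source folder would itself match parcel.* on the next run.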