Select to view content in your preferred language

Batch import Excel files for processing

5240
3
08-12-2011 07:36 AM
RoyHewitt
Deactivated User
Hello all,

I'm trying to write a script tool for one of our field biologists to process data.  XY data is contained in several excel files all saved to the same folder.  I'm using the table to table tool (Conversion tools) followed by Make XY Event Layer (Data Management).

I've tried the ListTables() command, but it isn't finding any of the .xls files.  Here is an example of my code.


import arcpy
arcpy.env.workspace = "C:/Documents and Settings/RHewitt/Desktop/SAV Report files 2010/"
outPath = "C:/Documents and Settings/RHewitt/Desktop/SAV Report files 2010/Output/"


# Local variables:
xVal = "Longitude"
yVal = "Latitude"
projWGS = "GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]];-400 -400 1000000000;-100000 10000;-100000 10000;8.98315284119521E-09;0.001;0.001;IsHighPrecision"

# List all files in folder, then list all tables in file
fileList = arcpy.ListFiles()
tableList = arcpy.ListTables()

try:
    # Loop through each table
    for file in fileList:
        for table in tableList:
            # Convert excel to DBF, make XY event layer, save to output
            arcpy.TableToTable_conversion(table, outpath, table)
            arcpy.MakeXYEventLayer_management(table, xVal, yVal, "now_" + table, projWGS)
            arcpy.CopyFeatures_management("now_" + table, outpath + table)
except:
    print arcpy.GetMessages()


Thanks,
Roy
Tags (2)
0 Kudos
3 Replies
JakeSkinner
Esri Esteemed Contributor
You will need to specify which files to search for in the 'arcpy.ListFiles' function.  Also, you will need to specify the Sheet # when executing the 'arcpy.TabletoTable_conversion' tool when working with Excel files.  Ex:

lstFiles = arcpy.ListFiles("*.xls")
for file in lstFiles:
    arcpy.TableToTable_conversion(file + "\\" + "Sheet1$", outpath, "test.dbf")
0 Kudos
HILLARYHALL
Emerging Contributor
Recently, I have just finished my excel processing application, which acts as one of the most important functions of excel document reading, edting and saving sdk package.  I am not sure which kind of specific excel processing function you are prorgamming. Broadly speaking, the excel document managing and editing library usually contains the basic excel spreadsheets handling, such as excel page inserting, deleting, sorting, and excel documents splitting and merging as well as adding image to excel page and extracting one or more excel pages.
0 Kudos
KimOllivier
Honored Contributor
At ArcGIS 10.2.1 there are new python modules included called XLRD and XLWT which are excellent modules for reading spreadsheets and extracting the data. It can handle multiple sheets and can pick out any cell or range of cells.

Don't bother trying to use the new Excel to Table tool unless the data is already in a suitable format with valid field names etc.

Here is an example of reading the 4th sheet that was not formatted for a database, no header field names and lots of comments all over the sheet. I needed three columns in the middle of the sheet.

By the way watch for new uses of Sqlite with spatial extensions as Spatialite and the OSGEO recent standard
http://www.opengeospatial.org/standards/geopackage

#-------------------------------------------------------------------------------
# Name:        loadXLS
#               read Table 4 from 68 spreadsheets to get Unoccupied Dwellings
# Purpose:      read Stats spreadsheets into a Sqlite database that cannot
#               be read by FME, ArcGIS or anything other than Excel
#               saves writing out to CSV first
# Author:      kimo
#
# Created:     08/05/2014
# Copyright:   (c) kimo 2014
# Licence:     Creative Commons
#-------------------------------------------------------------------------------
import xlrd # installed with ArcGIS 10.2 by default
import sys,os
import re
import glob
import sqlite3
import arcpy # only needed for final message

pat = re.compile("^[0-9]{6}") # pattern of AU code in AU name in Column 0 ("A")
dType = {0:'Empty', 1:'Text', 2:'Number', 3:'Date', 4:'Boolean', 5:'Error', 6:'Blank'}

def extract_tla(filename):
    '''extract Unoccupied Dwellings'''
    workbook = xlrd.open_workbook(filename)
    worksheet = workbook.sheet_by_name('Table 4')
    num_rows = worksheet.nrows
    num_cells = worksheet.ncols
    rec = 0
    for curr_row in range(num_rows):
        curr_col = 0
        cellAn = worksheet.cell_value(curr_row, curr_col)
        cell_type0 = worksheet.cell_type(curr_row, curr_col)
        if cell_type0 == 1 and pat.match(cellAn):
            a = str(worksheet.cell_value(curr_row,curr_col))
            val = [a[0:6],a[7:]]
            for curr_col in range(1,4):
                count = worksheet.cell_value(curr_row,curr_col)
                if count == '-':
                    n = 0
                else :
                    n = int(count)
                val.append(n)
            rec+=1
            #print val
            insert_ud_row(udTable,val)
    # print "Records found",rec
    # fields au2013,auname,ud2001,ud2006,ud2013
    return rec

def create_ud_tab(conn,udTable):
    # (re-)create a table
    c = conn.cursor()
    c.execute('''DROP TABLE IF EXISTS '''+ udTable)
    c.execute('''CREATE TABLE '''+udTable+''' (
        au       text NOT NULL,
        auname       text ,
        ud2001       integer,
        ud2006       integer,
        ud2013       integer )''')
    conn.commit()
    c.close()
    return

def insert_ud_row(outTab,row):
    '''insert a row
    requires a sqlite database to be connected
    and a cursor to be open
    '''
    try:
        cmd = "INSERT INTO " + outTab + " VALUES (" + (",?"*(len(row)))[1:] + ")"
        # implied transaction already opened
        c.execute(cmd,row)
    except Exception,msg:
        print msg
        print row
        print cmd
        conn.rollback()
        conn.close()
        sys.exit()

# -------------------------------------------------------------------------
if __name__ == '__main__':
    try:
        source_folder = sys.arg[1]
    except:
        source_folder = 'C:/data/census2013/source'
    os.chdir(source_folder)
    conn = sqlite3.connect('C:/data/census2013/dwelling.sqlite')
    udTable = 'DwellingEmptyAU'
    create_ud_tab(conn,udTable)
    # Open a cursor and insert records from all the spreadsheets
    c = conn.cursor()
    print "sqlite database successfully opened"
    trec = 0
    lstXLS = glob.glob('*.xls')
    for x in lstXLS:
        rec = extract_tla(x)
        trec+=rec
        print rec,x
    conn.commit()
    c.close()
    conn.close()
    msg = '{} Total records inserted into {}'.format(trec,udTable)
    print msg
    arcpy.AddMessage(msg)
0 Kudos