XLS / CSV issues in a script

11-22-2010 11:52 AM
NolanGeise
New Contributor
I'm very new to Python and haven't had any classes yet so I'm feeling my way around so far. Also if I'm in the wrong forum, please let me know.

I have a script that I am trying to build using Python that takes a CSV and creates a shapefile, saves it to a specific location, then moves the CSV to a backup folder. Sounds easy right?

I'm calling up the Table To Point tool from the Military Analyst toolbox, which works, but the issue I'm running into is the formatting of the CSV. I have a field called ID that I need to be a text field, and it is a text field if I open the CSV in Excel, but when I open it in Arc it's a Double field. The field data looks like this in Excel (s001, s002, v1, v2, s003a, s003b) and like this in Arc (1, 2, 1, 2, NULL, NULL).

When I manually open the CSV in Excel and save it out as an XLS file, it looks fine in Arc, but I'm trying to do all of this automatically. The analyst puts the CSV in a "drop box" folder, then my script grabs it, makes the shapefile, and moves the CSV. So I guess I'm thinking I need to know how to either:

A.) fix the formatting issue and use the current CSV file and then figure out how to move it

or

B.) figure out how to have the script open Excel, save the file as an XLS, do the processing, then create a CSV and put it in the backup folder.

Thoughts?
10 Replies
ChrisSnyder
Regular Contributor III
When you are reading the .csv file, can't you just force it to assume a string value?

for readLine in open(csvFilePath, 'r'):  # csvFilePath = path to your .csv file
    lineValues = readLine.split(",")
    idValue = str(lineValues[0])         # 1st field, force it to be a string
    otherValue = int(lineValues[1])      # 2nd field, force it to be an integer
    anotherValue = float(lineValues[2])  # 3rd field, force it to be a float


Another idea: Create a "companion" schema.ini file (http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx) so that it forces ArcMap to see the data how you want it.
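
For example, a quick sketch of writing that companion file from Python (the folder, file name, and column width here are made up, not from your data):

import os

csv_folder = r"C:\dropbox"                # hypothetical folder holding the .csv
ini = open(os.path.join(csv_folder, "schema.ini"), "w")
ini.write("[survey.csv]\n")               # one [filename] section per .csv
ini.write("Format=CSVDelimited\n")
ini.write("Col1=ID Text Width 25\n")      # force the ID column to be read as text
ini.close()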

What does your code and .csv file look like?
ChrisMathers
Occasional Contributor III
I'm not sure about using the csv library to read, but if you open a file straight up with open(file,'r') and split on the ',' like Chris said, you should get a list of strings regardless of what's in the strings.

As for copying the file to a backup folder, you can use shutil.copyfile(source, destination) and then os.remove(source), which will copy your file and then delete the old one.
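
For example (a minimal sketch; the paths are just placeholders):

import os, shutil

source = r"C:\dropbox\points.csv"       # hypothetical input CSV
destination = r"C:\backup\points.csv"   # hypothetical backup location

shutil.copyfile(source, destination)    # copy the CSV to the backup folder
os.remove(source)                       # then delete the original

shutil.move(source, destination) would also do both steps in one call.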

I would not go into opening Excel and mucking with win32com if you can avoid it.
NolanGeise
New Contributor
Thanks for the tips. I tried the "companion" schema.ini and it seems to work. I am on a closed system, so I will retype here what I put in there.

[filename.csv]
Format=CSVDelimited
Col1=ID Text Width 25
Col2=xxx Text Width 12
Col3=xxx Text Width 12
...

I noticed that when I put in Col1, I had to put in all of the columns, which is OK because they are all static. My question now is: I need it to apply to all of the files that get placed in this folder, and at the moment it only has the name of the file I'm currently working on. Is there a way I can have it pull the name(s) of any files that are placed in the folder, so that the schema will apply to all files with a .csv extension in the folder?
ChrisSnyder
Regular Contributor III
You would have to create a separate .ini file for each .csv file (or at least make one .ini and then create multiple copies, one for each .csv). myfile2.csv needs a matching myfile2.ini

Maybe a better idea: Can't you just parse the .csv and force the data types using str(), int(), float(), etc.? I assume you are using insert cursors to write the geometry, right?
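
For what it's worth, here is a rough sketch of that idea (none of this is from the thread: the paths, field names, and three-column ID/Lat/Lon layout are made up, and it pretends the coordinates are plain decimal degrees rather than the DMS values the Military Analyst tool handles):

import os, csv
import arcgisscripting

gp = arcgisscripting.create(9.3)

csv_file = r"C:\dropbox\points.csv"      # hypothetical input CSV: ID, Lat, Lon
out_folder = r"C:\output"
out_name = "points.shp"
out_fc = os.path.join(out_folder, out_name)

# Build an empty point shapefile with a text ID field...
gp.CreateFeatureclass_management(out_folder, out_name, "POINT")
gp.AddField_management(out_fc, "ID", "TEXT", "", "", 25)

# Read the CSV ourselves and force each value to the type we want...
reader = csv.reader(open(csv_file, "r"))
reader.next()                            # skip the header row
rows = gp.InsertCursor(out_fc)
for id_value, lat, lon in reader:
    pnt = gp.CreateObject("Point")
    pnt.x = float(lon)                   # force the coordinates to numbers
    pnt.y = float(lat)
    row = rows.NewRow()
    row.shape = pnt
    row.ID = str(id_value)               # force the ID to stay a string
    rows.InsertRow(row)
del rows                                 # release the cursor/schema lock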
NolanGeise
New Contributor
Looks like I got it to work. I ended up creating the schema for the CSV that forced the formatting I wanted, as suggested, and then I put it in a temp folder. I pulled the CSV from the input folder, renamed it temp_file.csv, and put it in the temp folder with the schema. I then did the processing on that file, which fixed any problems with formatting. I then move the file to the backup folder and delete the input and temp files. Seems to work like a charm. Here's what it looks like (I'm having to retype this from another computer network, so if I fat-finger something, sorry):


# Import system modules...
import sys, os, re, shutil, arcgisscripting

# Create the Geoprocessor object...
gp = arcgisscripting.create()

# Load Required Toolboxes...
gp.AddToolbox(r"...")  # path to the Military Analyst toolbox (left out here)

# Variables...
input_folder = r"..."   # drop box folder the analyst puts the CSV in
temp_folder = r"..."    # temp folder that holds schema.ini
backup = r"..."         # backup folder
output_folder = r"..."  # folder the shapefiles are written to
coordinate_format_type = "DMS 2 Field"
coordinate_field_1 = "Lat"
coordinate_field_2 = "Lon"

def table_convert(input_folder):
    for dirpath, dirnames, names in os.walk(input_folder, topdown=True):
        for doc in names:
            try:
                doc_split = re.search(r"(.*)(\..*)", doc)
                if doc_split.group(2) == ".csv":
                    input_file = os.path.join(dirpath, doc)
                    temp_file = os.path.join(temp_folder, "temp_file.csv")
                    output_file = os.path.join(backup, doc)
                    output_feature_class = os.path.join(output_folder, doc_split.group(1) + ".shp")
                    # Copy the CSV next to schema.ini so the field types are forced...
                    shutil.copy(input_file, temp_file)
                    gp.TableToPoint_ma(temp_file, output_feature_class, coordinate_format_type, coordinate_field_1, coordinate_field_2, "", "", "")
                    # Back up the CSV and clean up the input and temp copies...
                    shutil.copy(temp_file, output_file)
                    os.remove(input_file)
                    os.remove(temp_file)
            except:
                gp.AddMessage("Error in Process")

# Process...
table_convert(input_folder)
NolanGeise
New Contributor
That didn't seem to format right... well, here's a Word doc in case anyone is interested. I am very open to any suggestions/corrections to this. I am self-taught but really want to learn the "right" way to do things.
ChrisSnyder
Regular Contributor III
Whatever works, but a suggestion: always attempt to minimize dependencies (in your case, the .ini file creation). Glad the .ini worked, BTW, but there may be a more bullet-proof solution...

Seems like you should be able to handle it in the "Table Convert" process (using type forcing like I mentioned above)... Can you post that part of your code?
NolanGeise
New Contributor
Whatever works, but a suggestion: always attempt to minimize dependencies (in your case, the .ini file creation). Glad the .ini worked, BTW, but there may be a more bullet-proof solution...

Seems like you should be able to handle it in the "Table Convert" process (using type forcing like I mentioned above)... Can you post that part of your code?


I am all for suggestions... I kind of got the feeling that having a generic .ini file in there to force the correct formatting wasn't the "proper" or most efficient and stable way to do things.

Pardon my ignorance, but when you say type forcing and parsing the CSV, I am at a loss. The script that I put in the previous post is the whole script I wrote. I am just calling up the Table To Point tool from Military Analyst to create the shapefile, so to be honest, I'm not sure what you are saying when you say "I assume you are using insert cursors to write the geometry, right?" I think I understand the concept of parsing as this: taking the CSV, extracting the information from it, and applying the format you want?
MikeHunter
Occasional Contributor
At first glance, it seems easy to parse stuff out of a CSV file.  But as you've found, you quickly run into typing trouble.  What I learned is that using the built-in csv module is the easiest way to get what you want with a minimum of fuss.  And it's really easy to use:

import csv

ftxt = 'c:/temp/test.csv'
fcsv = open(ftxt, 'r')
creader = csv.reader(fcsv)
for lst in creader:
    print lst  # lst is a list of strings, one per column; do your real work here

In the loop above, lst will be a Python list of values for each row in the csv file. And I'll bet you'll get the IDs and other stuff as strings, as you need. Then you can maybe use an update or insert cursor to put the values into the shapefile. Give it a shot.
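
A rough sketch of that last step (none of this is from the thread: the paths and the ID_TXT field name are made up, and it assumes the shapefile rows come out in the same order as the CSV rows):

import csv
import arcgisscripting

gp = arcgisscripting.create(9.3)

shp = r"C:\output\points.shp"       # hypothetical shapefile from Table To Point
ftxt = 'c:/temp/test.csv'

# Add a text field to hold the IDs as strings (ID_TXT is a made-up name)...
gp.AddField_management(shp, "ID_TXT", "TEXT", "", "", 25)

creader = csv.reader(open(ftxt, 'r'))
creader.next()                      # skip the header row, if the CSV has one
rows = gp.UpdateCursor(shp)
row = rows.Next()
for lst in creader:
    row.ID_TXT = lst[0]             # the ID comes through as a string, e.g. 's003a'
    rows.UpdateRow(row)
    row = rows.Next()
del row, rows                       # release the cursor/schema lock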

Mike