Hey everyone,
I wanted to ask the community a general question. I have been learning Python on my own and have just completed what I call my first "real world use" script for my company. It is a pretty basic script, but I am trying to get better. The script could be used by multiple people who don't know anything about Python. For their sake and mine, I wanted to see whether there are improvements I could make so that it is more robust and "user friendly". I am using all of this as a learning process.
Any help, ideas, opinions, comments would be fantastic.
-Matt
import time, sys, platform, imp, arcpy

start = (time.strftime('%x %X')) # Start time

##### MUST EDIT THESE INPUTS ##### MUST EDIT THESE INPUTS ##### MUST EDIT THESE INPUTS ##### MUST EDIT THESE INPUTS #####
arcpy.env.workspace = 'insert path' # Location of output files
newInput = 'insert path' # Input File (New Year)
clipFeat = 'insert path' # Clip Feature
eraseFeat = 'insert path' # Previous years merge
x = '2014' # Change for current year
year = str(x) # Input file year (Change only the year)
outClip = 'r' + x + 'C' # Clip Output
outLayer = 'Query_Layer' # Layer output (Used to run the query)
outFc = 'Query_' + outClip # Shapefile output from the Layer file
eraseOut = 'r' + x + 'E' # Erase output
arcpy.env.overwriteOutput = 1 # Allows for output file overwriting
messageList = [] # Message list for the log file
log = 'insert path' # Creates a .txt doc with a log of the geoprocessing tools run (Edit path)
##### MUST EDIT THESE INPUTS ##### MUST EDIT THESE INPUTS ##### MUST EDIT THESE INPUTS ##### MUST EDIT THESE INPUTS #####

# Clip file to specified parameter
arcpy.Clip_analysis(newInput, clipFeat, outClip)
messageList.append(arcpy.GetMessages())

# Create a layer to add to Table of Contents
newLyr = arcpy.MakeFeatureLayer_management(outClip, outLayer)[0]
messageList.append(arcpy.GetMessages())

# Add year field and Calculate Field
arcpy.AddField_management(outClip, "Year", "TEXT", 0, "", 10, "","NULLABLE", "NON_REQUIRED","")
messageList.append(arcpy.GetMessages())
arcpy.CalculateField_management(outLayer, 'Year', year, "PYTHON_9.3", "")
messageList.append(arcpy.GetMessages())

# Add Fields and run USDA QUERY on the Layer
arcpy.AddField_management(outLayer, "BType", "TEXT", 0, "", 10, "","NULLABLE", "NON_REQUIRED","")
messageList.append(arcpy.GetMessages())
arcpy.AddField_management(outLayer, "MPB_Only", "TEXT", 0, "", 10, "","NULLABLE", "NON_REQUIRED","")
messageList.append(arcpy.GetMessages())
arcpy.AddField_management(outLayer, "Aspen_Dec", "TEXT", 0, "", 10, "","NULLABLE", "NON_REQUIRED","")
messageList.append(arcpy.GetMessages())

# Query the data, based on the provided information from the USDA (Forest Service)
# MPB
arcpy.SelectLayerByAttribute_management(outLayer, "NEW_SELECTION", ' "DCA1" = 80003 OR "DCA2" = 80003 OR "DCA3" = 80003 OR "DCA1" = 11006 OR "DCA2" = 11006 OR "DCA3" = 11006 ')
messageList.append(arcpy.GetMessages())
arcpy.CalculateField_management(outLayer, "MPB_Only", 'str("MPB")', "PYTHON_9.3", "")
arcpy.SelectLayerByAttribute_management(outLayer, "CLEAR_SELECTION")

# LP
arcpy.SelectLayerByAttribute_management(outLayer, "NEW_SELECTION", ' ("DCA1" = 11006 AND "HOST1" = 108) OR ("DCA2" = 11006 AND "HOST2" = 108) OR ("DCA3" = 11006 AND "HOST3" = 108) ')
messageList.append(arcpy.GetMessages())
arcpy.CalculateField_management(outLayer, "BType", 'str("LP")', "PYTHON_9.3", "")
arcpy.SelectLayerByAttribute_management(outLayer, "CLEAR_SELECTION")

# PP
arcpy.SelectLayerByAttribute_management(outLayer, "NEW_SELECTION", ' ("DCA1" = 11006 AND "HOST1" = 122) OR ("DCA2" = 11006 AND "HOST2" = 122) OR ("DCA3" = 11006 AND "HOST3" = 122) ')
messageList.append(arcpy.GetMessages())
arcpy.CalculateField_management(outLayer, "BType", 'str("PP")', "PYTHON_9.3", "")
arcpy.SelectLayerByAttribute_management(outLayer, "CLEAR_SELECTION")

# 5N
arcpy.SelectLayerByAttribute_management(outLayer, "NEW_SELECTION", ' ("DCA1" = 80003) OR ("DCA2" = 80003) OR ("DCA3" = 80003) ')
messageList.append(arcpy.GetMessages())
arcpy.CalculateField_management(outLayer, "BType", 'str("SN")', "PYTHON_9.3", "")
arcpy.SelectLayerByAttribute_management(outLayer, "CLEAR_SELECTION")

# SB
arcpy.SelectLayerByAttribute_management(outLayer, "NEW_SELECTION", ' ("DCA1" = 11009) OR ("DCA2" = 11009) OR ("DCA3" = 11009) ')
messageList.append(arcpy.GetMessages())
arcpy.CalculateField_management(outLayer, "BType", 'str("SB")', "PYTHON_9.3", "")
arcpy.SelectLayerByAttribute_management(outLayer, "CLEAR_SELECTION")

# DFB
arcpy.SelectLayerByAttribute_management(outLayer, "NEW_SELECTION", ' ("DCA1" = 11007) OR ("DCA2" = 11007) OR ("DCA3" = 11007) ')
messageList.append(arcpy.GetMessages())
arcpy.CalculateField_management(outLayer, "BType", 'str("DFB")', "PYTHON_9.3", "")
arcpy.SelectLayerByAttribute_management(outLayer, "CLEAR_SELECTION")

# WBBB
arcpy.SelectLayerByAttribute_management(outLayer, "NEW_SELECTION", ' ("DCA1" = 80002) OR ("DCA2" = 80002) OR ("DCA3" = 80002) ')
messageList.append(arcpy.GetMessages())
arcpy.CalculateField_management(outLayer, "BType", 'str("WBBB")', "PYTHON_9.3", "")
arcpy.SelectLayerByAttribute_management(outLayer, "CLEAR_SELECTION")

# WSB
arcpy.SelectLayerByAttribute_management(outLayer, "NEW_SELECTION", ' ("DCA1" = 12040) OR ("DCA2" = 12040) OR ("DCA3" = 12040) ')
messageList.append(arcpy.GetMessages())
arcpy.CalculateField_management(outLayer, "BType", 'str("WSB")', "PYTHON_9.3", "")
arcpy.SelectLayerByAttribute_management(outLayer, "CLEAR_SELECTION")

# WPB
arcpy.SelectLayerByAttribute_management(outLayer, "NEW_SELECTION", ' ("DCA1" = 11002) OR ("DCA2" = 11002) OR ("DCA3" = 11002) ')
messageList.append(arcpy.GetMessages())
arcpy.CalculateField_management(outLayer, "BType", 'str("WPB")', "PYTHON_9.3", "")
arcpy.SelectLayerByAttribute_management(outLayer, "CLEAR_SELECTION")

# Aspen Decline
arcpy.SelectLayerByAttribute_management(outLayer, "NEW_SELECTION", ' ("DCA1" = 80001) OR ("DCA2" = 80001) OR ("DCA3" = 80001) OR ("DCA1" = 24032) OR ("DCA2" = 24032) OR ("DCA3" = 24032) ')
messageList.append(arcpy.GetMessages())
arcpy.CalculateField_management(outLayer, "Aspen_Dec", 'str("AD")', "PYTHON_9.3", "")
arcpy.SelectLayerByAttribute_management(outLayer, "CLEAR_SELECTION")

# Creates a Shape file from the layer
arcpy.CopyFeatures_management(outLayer, outFc)
messageList.append(arcpy.GetMessages())

# Erase
arcpy.Erase_analysis(outFc, eraseFeat, eraseOut)
messageList.append(arcpy.GetMessages())

# Writes information to log file
f = open(log, 'w') # Log file location
f.write('Created by: Matt Russo' + '\n')
f.write('Email: matthewrusso1986@gmail.com' + '\n')
f.write('Architecture : ' + platform.architecture()[0] + '\n')
f.write('Python EXE : ' + sys.executable + '\n')
f.write('Path to arcpy : ' + imp.find_module('arcpy')[1] + '\n' +'\n')
f.write('Start time was: ' + start + '\n')
end = (time.strftime('%x %X')) # End time
f.write('End time was: ' + end + '\n' + '\n')
for message in messageList:
    f.write(message)
    f.write('\n' + '\n')
f.close()
The PEP8 Style Guide is a great place to start. Now that you are familiar with the syntax, the style guide will make more sense.
From the start, the style guides recommend putting each module you import on its own line. So instead of this
import time, sys, platform, imp, arcpy
Do this
import arcpy
import imp
import platform
import sys
import time
Inline comments are helpful but can be overdone, so be careful. They should also be separated from the code line by at least two spaces (you only have one).
Consider the maximum line length. Super long lines of code are very hard (for a human) to interpret, and arcpy functions are notorious for turning into monsters. Try declaring more of your function arguments as variables and take advantage of the multi-line capabilities within the parentheses.
Instead of this
arcpy.SelectLayerByAttribute_management(outLayer, "NEW_SELECTION", ' "DCA1" = 80003 OR "DCA2" = 80003 OR "DCA3" = 80003 OR "DCA1" = 11006 OR "DCA2" = 11006 OR "DCA3" = 11006 ')
Do this
where_clause = ' "DCA1" = 80003 OR "DCA2" = 80003 OR "DCA3" = 80003 OR "DCA1" = 11006 OR "DCA2" = 11006 OR "DCA3" = 11006 '
arcpy.SelectLayerByAttribute_management(
    outLayer,
    "NEW_SELECTION",
    where_clause
)
Or you can take advantage of multiline strings and do something like this for the where clause. I used triple double quotes (""") here. Those are typically used for docstrings, so I normally prefer triple single quotes (''') for ordinary strings. However, the GeoNet forums do funny things with triple single quotes in Python syntax highlighting, so I had to write this with triple double quotes.
where_clause = """
    "DCA1" = 80003 OR "DCA2" = 80003 OR "DCA3" = 80003 OR
    "DCA1" = 11006 OR "DCA2" = 11006 OR "DCA3" = 11006
"""
But I would go even further and actually improve the SQL for the where clause and do this
where_clause = """
    "DCA1" IN (80003, 11006) OR
    "DCA2" IN (80003, 11006) OR
    "DCA3" IN (80003, 11006)
"""
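If the codes change from year to year, an IN clause like this could also be generated instead of typed by hand. This is only a rough sketch: build_where_clause is a hypothetical helper name (it is not part of arcpy), and it assumes the DCA fields hold integer codes as in your original query.

```python
def build_where_clause(codes, fields=("DCA1", "DCA2", "DCA3")):
    """Return an SQL where clause matching any of `codes` in any of `fields`.

    Hypothetical helper, not part of arcpy. Assumes integer codes.
    """
    # int() guards against a code being passed in as a string by mistake.
    code_list = ", ".join(str(int(code)) for code in codes)
    parts = ['"{0}" IN ({1})'.format(field, code_list) for field in fields]
    return " OR ".join(parts)


# Codes taken from the MPB query in the original script.
mpb_clause = build_where_clause([80003, 11006])
print(mpb_clause)
```

The result can be passed to arcpy.SelectLayerByAttribute_management in place of the hand-written string.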
It may also help to declare some of the smaller local variables at the beginning of the code block they will be used in, rather than declaring everything at the top of the script. Or you can use comments to group the variables by use, like "connections", "tables", "feature classes", etc.
When opening files to read and write (like you've done at the end with the log file), it's better practice to use a with statement so the file is always closed, even if there was an error. If your current code errored after you opened the file and before you closed it, the file would not get closed.
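Here is a minimal sketch of how the log writing might look with a with statement. The write_log function name and the temporary file path are made up for the example, and the messages are placeholders standing in for your messageList.

```python
import os
import tempfile
import time


def write_log(log_path, start, end, messages):
    """Write the run times and geoprocessing messages to a text log."""
    # The with statement closes the file even if one of the writes fails.
    with open(log_path, 'w') as f:
        f.write('Start time was: ' + start + '\n')
        f.write('End time was: ' + end + '\n\n')
        for message in messages:
            f.write(message)
            f.write('\n\n')


# Example run against a temporary file (hypothetical path and messages).
start = time.strftime('%x %X')
end = time.strftime('%x %X')
example_log = os.path.join(tempfile.gettempdir(), 'example_log.txt')
write_log(example_log, start, end, ['first message', 'second message'])
```

There is no f.close() because the file is closed automatically when the with block ends.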
Finally, a lot of Python IDEs style two types of comments differently: # and ##
I like to use the single hash comment (#) for the heading of a large block of code and the double hash (##) for headings under the main heading or for inline comments. I also try to put one line between the main blocks of code and no lines between code that is all together so you can visually group the major blocks of logic. If a block of code is too long to easily fit this style, then it might be better off written as a function and called from your main().
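As an example of the function idea, the repeated select / calculate / clear steps in your script could probably be driven from a list. This is a sketch under a few assumptions: BTYPE_RULES and apply_rules are names I made up, only three of your queries are shown, and the arcpy module is passed in as the gp argument so the function can be read on its own. The arcpy calls are the same ones your script already uses.

```python
# Each rule: (field to fill, value to write, where clause selecting the rows).
# Only three of the original queries are listed here.
BTYPE_RULES = [
    ("BType", "SB", '"DCA1" = 11009 OR "DCA2" = 11009 OR "DCA3" = 11009'),
    ("BType", "DFB", '"DCA1" = 11007 OR "DCA2" = 11007 OR "DCA3" = 11007'),
    ("BType", "WPB", '"DCA1" = 11002 OR "DCA2" = 11002 OR "DCA3" = 11002'),
]


def apply_rules(gp, layer, rules, messages):
    """Select, calculate, and clear the selection once per rule.

    `gp` is assumed to be the imported arcpy module.
    """
    for field, value, where_clause in rules:
        gp.SelectLayerByAttribute_management(
            layer, "NEW_SELECTION", where_clause
        )
        messages.append(gp.GetMessages())
        # Same expression style as the original script, e.g. 'str("SB")'
        expression = 'str("{0}")'.format(value)
        gp.CalculateField_management(
            layer, field, expression, "PYTHON_9.3", ""
        )
        gp.SelectLayerByAttribute_management(layer, "CLEAR_SELECTION")
```

From your main() you would then call something like apply_rules(arcpy, outLayer, BTYPE_RULES, messageList). Because later rules overwrite earlier ones on the same rows, the order of the list matters just as the order of the blocks does in your script.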
I'm still learning Python myself and started just like you, Matthew. Please correct me if my assumptions are wrong!
Blake T,
Thanks for the response, this is the exact information that I am looking for. Right now I am at the point where I can get things to "work". I want to get more in line with the right styles and methods of doing things.
The inline comments are mainly there to help other people, who may not have coding experience, understand what is going on. My personal version of this contains far fewer comments.
Also great information on the import situation. I always wondered why most people did not use the multiple import method.
Thanks!
Good suggestions by Blake T.
Some more (though not specifically for your script) can be found here: Some Python Snippets
Have you considered hiding all code from your users by using your script as a script tool? http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//001500000006000000.htm
Then it's a matter of getting your users to enter parameters via the tool dialog, and accessing those parameters with arcpy.GetParameterAsText().
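A rough sketch of what reading those parameters might look like. The read_tool_parameters name, the dictionary keys, and the parameter order (0 to 4) are all assumptions for illustration; the indexes have to match the order in which the parameters are defined in the script tool's properties. The arcpy module is passed in as gp.

```python
def read_tool_parameters(gp):
    """Collect the script tool inputs in dialog order.

    `gp` is assumed to be the imported arcpy module. The index of each
    parameter is hypothetical and must match the tool definition.
    """
    return {
        "workspace": gp.GetParameterAsText(0),   # Location of output files
        "new_input": gp.GetParameterAsText(1),   # Input file (new year)
        "clip_feat": gp.GetParameterAsText(2),   # Clip feature
        "erase_feat": gp.GetParameterAsText(3),  # Previous years merge
        "year": gp.GetParameterAsText(4),        # Input file year
    }
```

In the script itself you would import arcpy, call params = read_tool_parameters(arcpy), and then use values such as params["workspace"] in place of the hard-coded 'insert path' strings.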
Yes Darren,
My original script was in a Python Toolbox. Going back to that is my plan, maybe further down the road, but it seems like every time the new data comes out something has changed, so I have to adjust the script.
Matt