What's in a Data Type:  When FeatureClass = String

Blog Post created by bixb0012 Champion on Aug 19, 2014

Anyone who has dabbled in ArcPy is likely familiar with the ArcPy and Tool reference sections of the ArcGIS Help.  After all, those sections are where the functionality of ArcPy classes and ArcGIS Tools is documented, including descriptions, syntax, and examples.  Although the two sections share much the same look and feel, there is an important inconsistency between their Syntax tables.  The inconsistency isn't enough to trip up someone who regularly writes code, but I regularly see it cause confusion among those who are just beginning to script with ArcPy and ArcGIS Tools.


Let's get a couple of examples on deck.  I will start with a Syntax screenshot from the ListLayers function of the arcpy.mapping module, and follow it up with a partial screenshot of the Syntax for the Dissolve tool in ArcGIS Desktop.


At first glance, it is easy to notice the consistency between the look and feel of the Syntax tables, e.g., both tables have the same formatting style and column headers.  There is value in consistency, especially in documentation, but consistency in style doesn't always equate to consistency in content, and this is where I see beginning scripters stumble when reading through Esri documentation.  The Syntax tables in the ArcPy and Tool reference sections of the ArcGIS Help share the same column headings, but the content of the Data Type column differs between the two sections.


I posted the Syntax screenshot from the ListLayers function first because 'Data Type' in the ArcPy section is consistent with what the vast majority of people have in mind when discussing programming/scripting and data types.  For example, the data type for the map_document_or_layer parameter is listed as Object, and the explanation column states it needs to be a variable with a reference to a MapDocument or Layer object.  The wildcard parameter is listed as being a String, and the data_frame parameter is listed as being an arcpy.mapping DataFrame object.


It is interesting to note the data_frame parameter has a specific data type given while the map_document_or_layer parameter has a generic Object data type given.  My guess is that since the former parameter accepts a single object type while the latter accepts two different object types, someone made a judgment call to go with the more generic Object data type instead of listing all of the applicable object types in the Data Type column.  Fair enough.


As mentioned above, the formatting style for Syntax is identical between the ArcPy and Tool reference sections of the ArcGIS Help, right down to the column headings.  Whereas 'Data Type' in the ArcPy section is consistent with general programming usage, 'Data Type' in the Tool reference section means something a bit different, a bit muddled in my opinion.  For users just starting out with Python and ArcPy scripting, looking at the Dissolve Syntax table might lead them to believe the in_features parameter accepts a Feature Layer object and the out_feature_class parameter accepts a Feature Class object.  Unfortunately, they would be wrong, or wrong enough to be confused.  Let's see if some sample code gives any clarification.
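
The help page's own sample is not reproduced here, so what follows is only a minimal sketch in the same shape.  The layer name, output path, and field names are hypothetical, and the arcpy call itself is left as a comment so the snippet runs without ArcGIS installed.  It simply reports the Python type of each argument one would hand to Dissolve.

```python
# Hypothetical arguments for the Dissolve tool; every name and path here is
# made up for illustration and is not taken from the Esri help sample.
in_features = "taxlots"
out_feature_class = "C:/output/output.gdb/taxlots_dissolved"
dissolve_field = ["LANDUSE", "TAXCODE"]

dissolve_args = {
    "in_features": in_features,
    "out_feature_class": out_feature_class,
    "dissolve_field": dissolve_field,
}

# Record and print the Python type of each argument.
arg_types = {name: type(value).__name__ for name, value in dissolve_args.items()}
for name, type_name in arg_types.items():
    print(name, "->", type_name)

# With ArcGIS Desktop installed, the call itself would look like this:
# import arcpy
# arcpy.Dissolve_management(in_features, out_feature_class, dissolve_field)
```

Nothing in the argument list is a Feature Layer or Feature Class object in the Python sense; the only types in play are str and list.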


That's interesting:  every parameter in the sample code is a string or a list of strings.  If the out_feature_class parameter has a data type of Feature Class, why would the sample code be passing it a string?  Are we missing something?  Maybe the Understanding tool syntax help page has some answers.  Looking at Data Type:


The first paragraph makes sense, i.e., there are simple data types like strings and integers and more complex data types like arcpy objects.  The second paragraph is where things get interesting:  "Tool parameters are usually defined using simple text strings."  Huh, so parameters have data types but all data types are 'usually defined using simple text strings.'  A string is a string but so is a Feature Class.  Interesting.  If one follows the data type link visible in the screenshot above, a bit more explanation can be found in the Data types for geoprocessing tool parameters page.


As best I can tell, the Syntax tables in the Tool reference section give a Data Type, like in the ArcPy section, but it isn't really a data type the way most people would think of a data type when programming Python.  Just as a picture of a table is different from the table itself, a string representation of an object isn't the same as the object, and their data types aren't one and the same either.  What comes to my mind is the difference in databases between data types and data domains.  A column containing an 'M' or 'F' for gender still has a data type of string, even if the string represents the gender of an individual.
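
The distinction can be sketched in plain Python.  The FeatureClass class below is a hypothetical stand-in invented for illustration; it is not an arcpy class, and the path is made up.

```python
class FeatureClass(object):
    """Hypothetical stand-in for a feature class object (not part of arcpy)."""

    def __init__(self, path):
        self.path = path

    def __str__(self):
        # The string representation is just the path to the data.
        return self.path


fc_object = FeatureClass("C:/data/parcels.shp")  # made-up path
fc_string = str(fc_object)

# The object and its string representation have different data types.
object_type = type(fc_object).__name__
string_type = type(fc_string).__name__
print(object_type)
print(string_type)

# The database analogy: the domain is {'M', 'F'}, but the data type is a string.
gender = "M"
gender_type = type(gender).__name__
print(gender_type)
```

The string carries enough information to find the feature class, but as far as Python is concerned it is a str and nothing more, just as the 'M' is.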


I don't see an issue with using string representations of objects.  After all, there is a lot more overhead in passing objects or object references than strings.  But don't overload the meaning of a commonly understood term just so column headings can be the same between two different sections in the manual.  Consistency has value, but it shouldn't come at the expense of correctness.