Though the documentation refers to a "Describe object", it must be some sort of C class (apparently via the arcgisscripting.create function), instead of a Python class. The standard Python introspection methods don't turn up much information (see below).
If one wishes to summarize all the known info of a dataset, how might one dynamically identify which properties apply to the current instance--without writing a long series of if statements and/or going through all the potential properties?
For reference: given a variable named "desc" returned from a call to arcpy.Describe, here are the results of some inspection and introspection operations:
>>> type(desc)
<type 'geoprocessing describe data object'>
>>> dir(desc)
[]
>>> desc.__name__
'Describe Object'
>>> desc.__class__
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
AttributeError: DescribeData: Method __class__ does not exist
>>> desc.__repr__
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
AttributeError: DescribeData: Method __repr__ does not exist
>>> isinstance(desc, object)
True
>>> desc.__dict__
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
AttributeError: DescribeData: Method __dict__ does not exist
>>> help(desc)
Help on geoprocessing describe data object object:
Describe Object = class geoprocessing describe data object(object)
>>> print(inspect.getmodule(desc))
None
>>> inspect.getclasstree(desc)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "C:\Python27\ArcGISx6410.4\Lib\inspect.py", line 726, in getclasstree
    for c in classes:
TypeError: 'geoprocessing describe data object' object is not iterable
>>> inspect.getmro(desc)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "C:\Python27\ArcGISx6410.4\Lib\inspect.py", line 346, in getmro
    _searchbases(cls, result)
  File "C:\Python27\ArcGISx6410.4\Lib\inspect.py", line 337, in _searchbases
    for base in cls.__bases__:
AttributeError: DescribeData: Method __bases__ does not exist
I think arcgisscripting was introduced with ArcGIS 9.0 and describe() has been around since the very beginning. In my mind, arcgisscripting wasn't so much a Python package as much as a Python wrapper for a COM-based package, i.e., arcgisscripting wasn't very Pythonic. The Describe Object has to be one of the least Pythonic objects in the entire ArcPy package, one of the reasons being what you have pointed out about lack of inspection and introspection.
This raises the question of, "why hasn't Esri made the Describe Object more Pythonic with any of the 9+ major releases since ArcGIS 9.0?" Unfortunately, I don't have an answer, not even a bad one. Part of it might have to do with "why fix what isn't broken" and another factor might be the nested hierarchy/inheritance of how the Describe Object works under the hood, but I am just speculating.
Although inspection and introspection of the Describe Object itself is non-existent, the documentation does lay it all out, so it isn't a complete guessing game as to which objects support which property. The following code scrapes the Esri documentation to extract all of the property categories and their properties:
>>> import bs4
>>> import urllib2
>>>
>>> site = "http://desktop.arcgis.com"
>>> path = "/en/arcmap/latest/analyze/arcpy-functions/describe.htm"
>>> desc_props = []
>>>
>>> # Collect (category name, relative URL) pairs from the "see also" list
>>> f = urllib2.urlopen(site + path)
>>> soup = bs4.BeautifulSoup(f.read())
>>> seealso = soup.find(class_="seealso bulleted")
>>> seealso_paths = [
...     (a.find(text=True), a['href'])
...     for a in seealso("a", href=True)
... ]
>>>
>>> # Visit each category page and pull the property names from its table
>>> for category, path in seealso_paths:
...     f = urllib2.urlopen(site + path)
...     soup = bs4.BeautifulSoup(f.read())
...     proptbl = soup.find(class_="arcpyclass_proptbl")
...     if proptbl:
...         proptbl.thead.extract()
...         desc_props += (
...             category, tuple(
...                 row.td.find(text=True)
...                 for row in proptbl("tr", recursive=False)
...             )
...         ),
...
>>> # Number of property categories/types
>>> print(len(desc_props))
31
>>> # Total number of properties
>>> print(sum(1 for category, props in desc_props for prop in props))
248
>>>
As you can see, there are 31 property categories or types and 248 properties, although no single object supports all 248 properties. As of ArcGIS 10.5, the categories and their properties from the code above are:
Describe Object Properties: baseName catalogPath children childrenExpanded dataElementType dataType extension file fullPropsRetrieved metadataRetrieved name path
ArcInfo Workstation Item: alternateName isIndexed isPseudo isRedefined itemType numberDecimals outputWidth startPosition width
ArcInfo Workstation Table: itemSet
CAD Drawing Dataset Properties: is2D is3D isAutoCAD isDGN
Cadastral Fabric Properties: bufferDistanceForAdjustment compiledAccuracyCategory defaultAccuracyCategory maximumShiftThreshold multiGenerationEditing multiLevelReconcile pinAdjustmentBoundary pinAdjustmentPointsWithinBoundary surrogateVersion type version writeAdjustmentVectors
Coverage FeatureClass Properties: featureClassType hasFAT topology
Coverage Properties: tolerances
Dataset Properties: canVersion changeTracked datasetType DSID extent isArchived isVersioned MExtent spatialReference ZExtent
Editor Tracking Properties: editorTrackingEnabled creatorFieldName createdAtFieldName editorFieldName editedAtFieldName isTimeInUTC
FeatureClass Properties: featureType hasM hasZ hasSpatialIndex shapeFieldName shapeType
GDB FeatureClass Properties: areaFieldName geometryStorage lengthFieldName representations
GDB Table Properties:
Geometric Network Properties:
LAS Dataset Properties:
Layer Properties: dataElement featureClass FIDSet fieldInfo layer nameString table whereClause
Mosaic Dataset Properties:
Network Analyst:
Network Dataset Properties:
Prj File Properties: spatialReference
Raster Band Properties: height isInteger meanCellHeight meanCellWidth noDataValue pixelType primaryField tableType width
Raster Catalog Properties: rasterFieldName
Raster Dataset Properties: bandCount compressionType format permanent sensorType
RecordSet and FeatureSet Properties: json pjson
RelationshipClass Properties: backwardPathLabel cardinality classKey destinationClassKeys destinationClassNames forwardPathLabel isAttachmentRelationship isAttributed isComposite isReflexive keyType notification originClassNames originClassKeys relationshipRules
RepresentationClass Properties: overrideFieldName requireShapeOverride ruleIDFieldName
Schematic Diagram Properties: diagramClassName
Table Properties: hasOID OIDFieldName fields indexes
TableView Properties: table FIDSet fieldInfo whereClause nameString
Tin Properties: fields hasEdgeTagValues hasNodeTagValues hasTriangleTagValues isDelaunay ZFactor
Topology Properties: clusterTolerance featureClassNames maximumGeneratedErrorCount ZClusterTolerance
Workspace Properties: connectionProperties connectionString currentRelease domains release workspaceFactoryProgID workspaceType
Instead of trying to work through all of the rules for which properties apply to what kind of object, I have always found just testing all of them is quite fast. That way, you know exactly which properties apply even if they aren't all documented.
Instead of using a try:except block, just use getattr() with a default value of None. For a file geodatabase polyline feature class, I get 56 properties that return some kind of value.
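The getattr() approach described above could be sketched roughly as follows. This is only a minimal sketch: known_properties, FakeDescribe, and the candidate list are hypothetical names made up for illustration, and FakeDescribe stands in for the object returned by arcpy.Describe so the snippet runs without ArcGIS. It assumes a missing property raises AttributeError, as the tracebacks earlier in the thread suggest.

```python
def known_properties(desc, candidate_names):
    """Return {name: value} for every candidate name that desc exposes."""
    found = {}
    for name in candidate_names:
        # getattr() with a default swallows the AttributeError raised
        # for properties that do not apply to this object.
        value = getattr(desc, name, None)
        if value is not None:
            found[name] = value
    return found


class FakeDescribe(object):
    """Hypothetical stand-in for an arcpy.Describe result (illustration only)."""
    shapeType = "Polyline"
    hasZ = False


# A short, hypothetical candidate list; a real one would hold all documented names.
candidates = ["shapeType", "hasZ", "meanCellHeight"]
summary = known_properties(FakeDescribe(), candidates)
for name in sorted(summary):
    print("{0}: {1}".format(name, summary[name]))
```

With ArcPy available, you would presumably pass arcpy.Describe(fc) in place of FakeDescribe() and supply the full list of documented property names. Note that this sketch keeps False values and drops only None.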
I agree that it can be difficult to navigate the properties of the Describe object. The properties are dynamic based on the object it is created from. Hardly recommendable, but you could do something like this:
def main():
    import arcpy
    fc = r'C:\GeoNet\Streets\GeoNet Street Sample.gdb\LebStreetSample'
    desc = arcpy.Describe(fc)
    atts = ['areaFieldName', 'lengthFieldName', 'datasetType', 'shapeFieldName',
            'OIDFieldName', 'meanCellHeight', 'whereClause']
    for att in atts:
        if hasattr(desc, att):
            value = eval('desc.{0}'.format(att))
            if value != '':
                print att, value

if __name__ == '__main__':
    main()
Yeah, that's what I was afraid of. I'll probably end up doing something similar, but I'll use try/except with getattr, instead of eval.
Maybe I'm missing something, but it seems to me that the factory pattern would make more sense than whatever hidden implementation is going on here. They may very well be using the factory pattern on the back-end C classes (I'm assuming that's what they are). But the combination of inconsistent return types and no introspection makes for grotesquely un-Pythonic code.
ArcGIS Pro 2.0 includes a new Describe function in the ArcPy Data Access module that returns all describe properties in a Python dictionary.
What's new in ArcGIS Pro 2.0—ArcGIS Pro | ArcGIS Desktop
Python
- A new arcpy.da.Describe function was added for describing data. It is similar to the arcpy.Describe function but returns its information as a Python dictionary.
I haven't heard whether this will be back-ported to ArcMap.
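Because the result is a plain dictionary, summarizing a dataset becomes ordinary dictionary work. The following is a rough sketch only: non_empty_items is a hypothetical helper name, and sample is a small made-up dictionary standing in for the return value of arcpy.da.Describe so the snippet runs without ArcGIS Pro.

```python
def non_empty_items(describe_dict):
    """Return sorted (name, value) pairs, skipping None and empty values."""
    empty = (None, "", [], (), {})
    return sorted(
        (name, value)
        for name, value in describe_dict.items()
        if value not in empty
    )


# Hypothetical sample; real code would use: sample = arcpy.da.Describe(fc)
sample = {
    "shapeType": "Polyline",
    "hasZ": False,
    "areaFieldName": "",
    "children": [],
}

for name, value in non_empty_items(sample):
    print("{0}: {1}".format(name, value))
```

Boolean False is kept on purpose, since hasZ being False is still information about the dataset.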
Is it possible that the new one is slower than the old one? Maybe because the new one builds a dictionary that is filled right away, while the old one "lazy loads" property data?
I have noticed no difference in speed or issues when working with locally stored data and Pro.
It's not that I have to wait a long time, but compare the execution time of these two lines when each is run separately in the ArcGIS Pro 2.0 Python window:
arcpy.Describe('data')
arcpy.da.Describe('data')
Which one is a bit faster? 🙂 The second one has the "running progress dots", while the first one returns the describe object at once.
I could see it going either way. On one hand, enumerating all the properties and populating a dictionary will take more time; but on the other hand, maybe the internal code within the Data Access module is faster. The DA cursors are much faster than the old/original cursors because the back-end code was optimized.
It is hard to say without some testing; fortunately, we can test:
>>> import arcpy
>>> import timeit
>>>
>>> fc = r'D:\transfer\geodata\Default.gdb\NHDWaterBody'
>>> timeit.timeit(lambda: arcpy.Describe(fc), number=1000)
72.3306900028995
>>> timeit.timeit(lambda: arcpy.da.Describe(fc), number=1000)
205.41314557604585
>>>
So, it turns out there is a cost to enumerating all of the properties and populating the dictionary. Since ArcPy Describe lazily evaluates properties, the test above isn't quite apples to apples, and I suspect the results will get closer the more properties you access.
Does 0.07 vs 0.21 seconds make a difference in your code when instantiating a Describe object? I could see some situations where it would, but I think in most cases the impact is negligible, and the added benefits of having the dictionary populated with all the properties far outweigh any performance difference.
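To illustrate why lazy evaluation makes the comparison uneven, here is a toy sketch of the two strategies. It is not how Esri implements either function; LazyDescribe, make_loader, and the property values are all hypothetical, and the counter simply records how many property lookups each strategy performs.

```python
calls = {"n": 0}


def make_loader(value):
    """Build a loader that counts each time a property is actually computed."""
    def load():
        calls["n"] += 1
        return value
    return load


class LazyDescribe(object):
    """Hypothetical lazy object: a property is computed only when accessed."""

    def __init__(self, loaders):
        self._loaders = loaders

    def __getattr__(self, name):
        # Called only for names not found normally, i.e. the lazy properties.
        try:
            loader = self._loaders[name]
        except KeyError:
            raise AttributeError(name)
        return loader()


loaders = {
    "shapeType": make_loader("Polyline"),
    "hasZ": make_loader(False),
    "extent": make_loader((0.0, 0.0, 1.0, 1.0)),
}

# Lazy: creating the object costs nothing; one access triggers one lookup.
lazy = LazyDescribe(loaders)
shape = lazy.shapeType
lazy_calls = calls["n"]

# Eager: every property is computed up front, whether it is used or not.
calls["n"] = 0
eager = dict((name, load()) for name, load in loaders.items())
eager_calls = calls["n"]

print("lazy lookups: {0}, eager lookups: {1}".format(lazy_calls, eager_calls))
```

In this toy model the eager cost is fixed while the lazy cost grows with each property touched, which matches the suspicion that the gap narrows as more properties are read.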
like I said... not noticeable speed difference... unless people sip coffee faster than I do
Plus the dictionary is easier to work with
d = arcpy.da.Describe(in_fc2)
d.keys()
Out[20]: dict_keys(['datasetType', 'children', 'hasM', 'FIDSet', 'extent',
'metadataRetrieved', 'name', 'hasGlobalID', 'dataElementType', 'isVersioned',
'representations', 'catalogPath', 'modelName', 'isCOGOEnabled', 'editorFieldName',
'areaFieldName', 'createdAtFieldName', 'changeTracked', 'extensionProperties',
'ZExtent', 'featureType', 'fields', 'fullPropsRetrieved', 'OIDFieldName', 'file',
'creatorFieldName', 'versionedView', 'indexes', 'childrenExpanded', 'rasterFieldName',
'canVersion', 'geometryStorage', 'relationshipClassNames', 'lengthFieldName',
'defaultSubtypeCode', 'hasZ', 'shapeFieldName', 'shapeType', 'aliasName', 'dataType',
'baseName', 'DSID', 'globalIDFieldName', 'extension', 'hasOID', 'MExtent', 'path',
'isTimeInUTC', 'spatialReference', 'editorTrackingEnabled', 'editedAtFieldName',
'subtypeFieldName', 'hasSpatialIndex'])
And timing results are variable, depending on what you are timing and how
import arcpy
%timeit(arcpy.Describe(in_fc2))
97.6 ms ± 4.61 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit(arcpy.da.Describe(in_fc2))
249 ms ± 3.35 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)