The interlude... An example of searchcursors and their array roots

Curses at cursors... pretty well a regular occurrence on this site. People love them and love to hate them.
People nest them, and nest them within 'for' loops and 'with' statements, with calls that sound almost poetic ( ... for row in rows, with while for arc thou ... ).
There are old cursors and new cursors (the da-less and the da cursors). Cursors appear in other guises, such as the new, and cleverly named, 'arcgis' module (digression # 1 ... really? something else with arcgis in it! Who is in charge of branding?).
Perhaps cursors are cloaked in other arcpy and data access module methods (e.g. blank-to-NumPyArray and NumPyArray-to-blank). Who knows for sure, since much is locked away in arcgisscripting.pyd.
Sadly, we deal in a world of mixed data types. Our tables contain columns of attributes, organized sequentially by rows. Sometimes the row order has meaning, sometimes not. Each column contains one data type in a well-formed data structure. This is why spreadsheets are pure evil for trying to create and maintain data structure, order and form (you can put anything anywhere).
If someone can explain why the plain SearchCursor is slower than its dressed-up (or down?) counterparts, I would love to hear about it.
Back to the main event
Harkening back to the fields of mathematics, arrays are assemblages of data in 1, 2, 3 or more dimensions. If an array of any dimension has a uniform data type, then life is easier from a structural and usage perspective (this is one reason why Remote Sensing is easier than GIS ... bring on the mail). We need to maintain an index which ties our geometry to our attributes, so that what goes where, and where is what, doesn't get mixed up (digression # 2 ... I am sure this isn't what the branders meant by The Science of Where, but we can only hope).
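The uniform-dtype idea is exactly what NumPy's structured arrays enforce, and it is the structure the cursor examples further down produce. A minimal sketch, using the same field layout (the values here are made up):

```python
import numpy as np

# Each named column holds exactly one data type, and the implicit row
# index keeps geometry and attributes in step.
dt = np.dtype([('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
               ('Shape_Length', '<f8'), ('Shape_Area', '<f8')])
a = np.zeros(3, dtype=dt)                 # three empty rows, structure enforced
a['OBJECTID'] = [1, 2, 3]
a['Shape'] = [[0., 0.], [1., 0.], [1., 1.]]
print(a.dtype.names)  # ('OBJECTID', 'Shape', 'Shape_Length', 'Shape_Area')
```

Try assigning a string to a['Shape_Area'] and NumPy will object... something a spreadsheet would happily let you do.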
Nerd Stuff
Enough with the boring stuff... bring on the code.
Some of this has been looked at from a slightly different perspective in
Get to the Points... arcpy, numpy, pandas
import arcpy

desc = arcpy.da.Describe(in_fc)               # in_fc: path to a featureclass
sk = sorted(desc.keys())                      # sorted keys
kv = [(k, desc[k]) for k in sk]               # key/value pairs
kv = "\n".join(["{!s:<20} {}".format(k, desc[k]) for k in sk])
With appropriate snips in the full list
[..., 'MExtent', 'OIDFieldName', 'ZExtent', 'aliasName', 'areaFieldName',
'baseName', ... 'catalogPath', ... 'dataElementType', 'dataType',
'datasetType', ... 'extent', 'featureType', 'fields', 'file', ... 'hasM',
'hasOID', 'hasSpatialIndex', 'hasZ', 'indexes', ... 'lengthFieldName',
... 'name', 'path', ... 'shapeFieldName', 'shapeType', 'spatialReference',
...]
SR = desc['spatialReference']
flds = "*"                                    # all fields
args = [in_fc, flds, None, SR, True, (None, None)]
cur = arcpy.da.SearchCursor(*args)            # get the search cursor object
dir(cur)
['__class__', '__delattr__', '__dir__', '__doc__', '__enter__', '__eq__',
'__esri_toolinfo__', '__exit__', '__format__', '__ge__', '__getattribute__',
'__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__',
'__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__',
'__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
'_as_narray', '_dtype', 'fields', 'next', 'reset']
cur.__class__
<class 'da.SearchCursor'>
cur.__class__.__mro__
(<class 'da.SearchCursor'>, <class 'object'>)
mro = cur.__class__.__mro__
s0 = set(dir(mro[0]))
s1 = set(dir(mro[1]))
sorted(set.difference(s0, s1))
['__enter__', '__esri_toolinfo__', '__exit__', '__getitem__', '__iter__',
'__next__', '_as_narray', '_dtype', 'fields', 'next', 'reset']
cur.__esri_toolinfo__
['FeatureLayer|Table|TableView|Dataset|FeatureDataset::::', 'String::*::',
'Python::None::', 'CoordinateSystem::::']
type(cur._as_narray())
<class 'numpy.ndarray'>
cur._as_narray().__array_interface__
{'version': 3,
'strides': None,
'shape': (0,),
'typestr': '|V36',
'descr': [('OBJECTID', '<i4'),
('Shape', '<f8', (2,)),
('Shape_Length', '<f8'),
('Shape_Area', '<f8')],
'data': (2044504703824, False)}
cur.fields
('OBJECTID', 'Shape', 'Shape_Length', 'Shape_Area')
cur._dtype
dtype([('OBJECTID', '<i4'),
('Shape', '<f8', (2,)),
('Shape_Length', '<f8'),
('Shape_Area', '<f8')])
cur._as_narray() # the cursor is exhausted, so all we get is the dtype
array([],
dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
('Shape_Length', '<f8'), ('Shape_Area', '<f8')])
cur.reset() # reset to the beginning
cur._as_narray()
array([(1, [342000.0, 5022000.0], 4000.0, 1000000.0),
(1, [342000.0, 5023000.0], 4000.0, 1000000.0),
(1, [343000.0, 5023000.0], 4000.0, 1000000.0),
(1, [343000.0, 5022000.0], 4000.0, 1000000.0),
(1, [342000.0, 5022000.0], 4000.0, 1000000.0)],
dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
('Shape_Length', '<f8'), ('Shape_Area', '<f8')])
cur.reset()
for row in cur:
print(("{} "*len(row)).format(*row)) # print individual elements
1 (342000.0, 5022000.0) 4000.0 1000000.0
1 (342000.0, 5023000.0) 4000.0 1000000.0
1 (343000.0, 5023000.0) 4000.0 1000000.0
1 (343000.0, 5022000.0) 4000.0 1000000.0
1 (342000.0, 5022000.0) 4000.0 1000000.0
cur.reset()
for row in cur:
print(row) # print the whole row as a tuple
(1, (342000.0, 5022000.0), 4000.0, 1000000.0)
(1, (342000.0, 5023000.0), 4000.0, 1000000.0)
(1, (343000.0, 5023000.0), 4000.0, 1000000.0)
(1, (343000.0, 5022000.0), 4000.0, 1000000.0)
(1, (342000.0, 5022000.0), 4000.0, 1000000.0)
cur.reset()
list(cur)
[(1, (342000.0, 5022000.0), 4000.0, 1000000.0),
(1, (342000.0, 5023000.0), 4000.0, 1000000.0),
(1, (343000.0, 5023000.0), 4000.0, 1000000.0),
(1, (343000.0, 5022000.0), 4000.0, 1000000.0),
(1, (342000.0, 5022000.0), 4000.0, 1000000.0)]
import numpy as np

cur.reset()
dt = cur._dtype
c_lst = list(cur)
a = np.asarray(c_lst, dtype=dt)
a
array([(1, [342000.0, 5022000.0], 4000.0, 1000000.0),
(1, [342000.0, 5023000.0], 4000.0, 1000000.0),
(1, [343000.0, 5023000.0], 4000.0, 1000000.0),
(1, [343000.0, 5022000.0], 4000.0, 1000000.0),
(1, [342000.0, 5022000.0], 4000.0, 1000000.0)],
dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
('Shape_Length', '<f8'), ('Shape_Area', '<f8')])
a = a.view(np.recarray)
a.Shape == a['Shape'] # check to see if slicing equals dot notation
array([[ True, True],
[ True, True],
[ True, True],
[ True, True],
[ True, True]], dtype=bool)
np.all(a.Shape == a['Shape'])
True
a.Shape # or a['Shape']
array([[ 342000., 5022000.],
[ 342000., 5023000.],
[ 343000., 5023000.],
[ 343000., 5022000.],
[ 342000., 5022000.]])
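You don't need a cursor (or arcpy) to confirm the recarray behaviour; the dot-versus-slice equivalence holds for any structured array, as this small standalone check shows (values borrowed from the output above):

```python
import numpy as np

# Reproduce the recarray check without a cursor: any structured array will do.
dt = np.dtype([('OBJECTID', '<i4'), ('Shape', '<f8', (2,))])
a = np.array([(1, [342000., 5022000.]), (1, [342000., 5023000.])], dtype=dt)
r = a.view(np.recarray)
print(np.all(r.Shape == r['Shape']))  # True: dot notation equals field slicing
```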
pnts = a.Shape[:-1] # get the unique points
cent = pnts.mean(axis=0) # return the mean by column
cent
array([ 342500., 5022500.])
import arraytools as art
art.e_dist(cent, pnts)
array([ 707.11, 707.11, 707.11, 707.11])
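arraytools is a custom helper module; if you don't have it, the centre-to-point distances can be reproduced in plain NumPy (assuming e_dist computes ordinary euclidean distance, which the 707.11 values suggest):

```python
import numpy as np

# Unique corners of the square polygon from the cursor output above.
pnts = np.array([[342000., 5022000.], [342000., 5023000.],
                 [343000., 5023000.], [343000., 5022000.]])
cent = pnts.mean(axis=0)                        # column-wise mean: the centre
dist = np.sqrt(((pnts - cent)**2).sum(axis=1))  # distance per point, each ~ 707.11
```

Each corner sits 500 m away in x and y, so sqrt(500**2 + 500**2) gives the 707.11 seen above.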
poly = a.Shape
art.e_leng(poly) # method to return polygon perimeter/length, total, then by segment
(4000.0, [array([[ 1000., 1000., 1000., 1000.]])])
art.e_area(poly)
1000000.0
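e_leng and e_area come from the same arraytools module; assuming they compute standard planar segment lengths and shoelace area (this sketch is not the module's actual implementation), plain NumPy reproduces the numbers:

```python
import numpy as np

# Closed ring: first point repeated as last, as in the cursor output.
poly = np.array([[342000., 5022000.], [342000., 5023000.],
                 [343000., 5023000.], [343000., 5022000.],
                 [342000., 5022000.]])
x, y = poly[:, 0], poly[:, 1]
area = 0.5 * abs(np.dot(x[:-1], y[1:]) - np.dot(x[1:], y[:-1]))  # shoelace formula
segs = np.sqrt((np.diff(poly, axis=0)**2).sum(axis=1))           # per-segment lengths
print(area, segs.sum())  # 1000000.0 4000.0
```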
in_flds = ['OID@', 'SHAPE@X', 'SHAPE@Y', 'Int_fld', 'Float_fld', 'String_fld']
When using the above notation, the position of the fields is used to reference their values. So you may see code that uses 'for row in cursor' with row[0] being the feature object id (OID@) and row[3] being the value from an integer field (Int_fld). If you are like me, anything beyond 2 means you are finger counting, remembering that Python counting is zero-based. I now prefer to spend the extra time assigning variable names rather than using positional notation. You can see this in the 'with ... as cursor' block below.
in_fc = r'C:\Folder\path_to\A_Geodatabase.gdb\FeatureClass'  # or Table
desc = arcpy.Describe(in_fc)
SR = desc.spatialReference
in_flds = ['OID@', 'SHAPE@X', 'SHAPE@Y', 'Int_fld', 'Float_fld', 'String_fld']
where_clause = None
spatial_reference = SR
explode_to_points = True
sql_clause = (None, None)
results = []
with arcpy.da.SearchCursor(in_fc, in_flds, where_clause, spatial_reference,
                           explode_to_points, sql_clause) as cursor:
    for oid, x, y, i_val, f_val, s_val in cursor:
        if oid > 10:
            # do stuff
            results.append([oid, x, y, i_val, f_val, s_val])
        else:
            # do other stuff
            pass
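The named-unpacking pattern above works on any iterable of row tuples, so you can prototype the loop logic without arcpy at all (the rows and field values here are made up for illustration):

```python
# Stand-in rows shaped like the cursor rows above (hypothetical values):
# (OID@, SHAPE@X, SHAPE@Y, Int_fld, Float_fld, String_fld)
rows = [(1, 342000.0, 5022000.0, 5, 2.5, 'a'),
        (12, 343000.0, 5023000.0, 7, 3.5, 'b')]
results = []
for oid, x, y, i_val, f_val, s_val in rows:   # names beat row[0], row[3]...
    if oid > 10:
        results.append((oid, s_val.upper()))  # do stuff
    else:
        results.append((oid, s_val))          # do other stuff
print(results)  # [(1, 'a'), (12, 'B')]
```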
Other discussions
----------
https://community.esri.com/docs/DOC-10416-are-searchcursors-brutally-slow-they-need-not-be
----------
More later...