Skip navigation
All People > Dan_Patterson > Py... blog > 2017 > July
2017

Geometry...

That is what I am normally interested in.  The baggage (aka, attributes) tag along for the ride and I normally find it easier to keep the baggage separate until I am done with the geometry. For those following along, see my previous post.

 

Let us compare some of the ways that we can pull the geometry out of a featureclass.  The following demonstrations can be followed in your own workflow for testing your specific cases.

 

Imports first ___________________________________________________________________________________

import sys
import numpy as np
import arcpy
import arcgis

_______________________________________________________________________________________________

I prefer to import modules in the order of less polluting first to ensure namespace is grabbed by what I want in order of importance in case there is any conflict during import.

 

Searchcursor __________________________________________________________________________________

# ---- pick a featureclass ----
fc0 = r'drive:\your_spaceless_folder_path\your_geodatabase.gdb\polygon'

# ---- using arcpy.Describe ----
desc = arcpy.Describe(fc0)
shp = desc.shapeFieldName    # ---- using dot notation ----

# ---- using arcpy.da.Describe ----
desc = arcpy.da.Describe(fc0)
shp = desc['shapeFieldName'] # ---- note... now a dictionary ----

# ---- geometry, as X, Y ----
flds = [shp + "@"]
shps = [row[0] for row in arcpy.da.SearchCursor(fc0, flds)]

_______________________________________________________________________________________________

 

Which of course you can begin to use to get basic properties.  In this example,

Geometry properties  ____________________________________________________________________________

for s in shps:
    print(s.type, s.partCount, s.pointCount, s.length, s.area)
polygon 1 5 40.0 100.0
polygon 1 10 64.0 64.0
polygon 2 14 99.41640786499875 182.0

_______________________________________________________________________________________________

 

Then you can get to work and convert to a NumPy array quickly and simply.  Since I know this is a polygon featureclass, it only takes a couple of lines to perform the task.

 

Get to the point(s) ______________________________________________________________________________

pnts = []
for shp in shps:
    for j in range(shp.partCount):
        pt = shp.getPart(j)
        p_list = [(pnt.X, pnt.Y) for pnt in pt if pnt]
        pnts.extend(p_list)
dt = [('Xs', '<f8'), ('Ys', '<f8')]
a = np.asarray(pnts, dtype=dt)

_______________________________________________________________________________________________

The basic difference between the array forms is how you want to work with them.

In the example above array 'a' has a specified data type (dtype).  The fields/columns of the array can be accessed by name (Xs and Ys).  Since this array is a structured array, the access would be a['Xs'] and a['Ys'].  If I converted this to a record array, one could use object.field notation.

 

The coordinates are nothing more than the same type of number, so the field names can be dispensed with altogether.  In this case, the sentient being is responsible for knowing what they are working with.  Both forms of the same data are shown below

_______________________________________________________________________________________________

a # a structured array
array([(300020.0, 5000000.0), (300010.0, 5000000.0), (300010.0, 5000010.0), ...,
          (300002.0, 5000002.0), (300008.0, 5000002.0), (300005.0, 5000008.0)],
      dtype=[('Xs', '<f8'), code="" ('ys',="" '<f8')])

a_nd = np.asarray(pnts)  # as an np.ndarray
array([[ 300020.00,  5000000.00],
       [ 300010.00,  5000000.00],
       [ 300010.00,  5000010.00],
       ...,
       [ 300002.00,  5000002.00],
       [ 300008.00,  5000002.00],
       [ 300005.00,  5000008.00]])
a_nd.dtype
dtype('float64')

_______________________________________________________________________________________________

 

The following demo function can be used on your data to examine some of the options and explore some of the properties and methods available to each.  Just don't forget the imports at the start of the post

 

The demo  _____________________________________________________________________________________

def _demo(in_fc):
    """Run the demo and return some objects
    : create a SpatialDataFrame class object from a featureclass
    : create a record array from a_sdf
    : create a numpy.ndarray using the da module
    """

    SR = arcpy.Describe(in_fc).SpatialReference

    sdf = arcgis.SpatialDataFrame
    a_sdf = sdf.from_featureclass(in_fc,
                                  sql_clause=None,
                                  where_clause=None,
                                  sr=SR)
    a_rec = sdf.to_records(a_sdf)  # record array
    a_np = arcpy.da.FeatureClassToNumPyArray(in_fc,
                                             field_names="*",
                                             spatial_reference=SR,
                                             explode_to_points=True)
    a_np2 = fc_g(in_fc)
    return sdf, a_sdf, a_rec, a_np, a_np2

 

Now... behind all the magic of the above options, a searchcursor is behind the acquisition of all of the options shown above.  The options do, however, provide access to methods and properties which are unique to their data class.  In many instances these are shared.

 

Here is what the 'look like' .  In the case of the SpatialDataFrame, a_sdf, the same when converted to a record array and the conventional da.FeatureClassToNumPyArray... all fields were included.  in the last option, a_np2, just the geometry is returned as demonstrated in the above examples, with the addition of handling the geometry parts and point when the polygon is 'exploded' to it's constituent points.

 

a_sdf
Out[40]:
   OBJECTID  Id Text_fld                                              SHAPE
0         1   1     None  {'rings': [[[300020, 5000000], [300010, 500000...
1         2   2        C  {'rings': [[[300010, 5000020], [300010, 500001...
2         3   3        A  {'rings': [[[300010, 5000010], [300010, 500002...

a_rec
Out[41]:
rec.array([ (0, 1, 1, None, {"rings": [[[300020, 5000000], [300010, 5000000], [300010, 5000010], [300020, 5000010], [300020, 5000000]]], "spatialReference": {"latestWkid": 2951, "wkid": 2146}}),
(1, 2, 2, 'C', {"rings": [[[300010, 5000020], [300010, 5000010], [300000, 5000010], [300000, 5000020], [300010, 5000020]], [[300002, 5000018], [300002, 5000012], [300008, 5000012], [300008, 5000018], [300002, 5000018]]], "spatialReference": {"latestWkid": 2951, "wkid": 2146}}),
(2, 3, 3, 'A', {"rings": [[[300010, 5000010], [300010, 5000020], [300020, 5000020], [300020, 5000010], [300010, 5000010]], [[300010, 5000010], [300010, 5000000], [300000, 5000000], [300000, 5000010], [300010, 5000010]], [[300005, 5000008], [300002, 5000002], [300008, 5000002], [300005, 5000008]]], "spatialReference": {"latestWkid": 2951, "wkid": 2146}})],
          dtype=[('index', '<i8'), ('OBJECTID', '<i8'), ('Id', '<i8'), ('Text_fld', 'O'), ('SHAPE', 'O')])

a_np
Out[42]:
array([(1, [300020.0, 5000000.0], 1, 'None', 40.0, 100.0),
       (1, [300010.0, 5000000.0], 1, 'None', 40.0, 100.0),
       (1, [300010.0, 5000010.0], 1, 'None', 40.0, 100.0), ...,
       (3, [300002.0, 5000002.0], 3, 'A', 99.41640786499875, 182.0),
       (3, [300008.0, 5000002.0], 3, 'A', 99.41640786499875, 182.0),
       (3, [300005.0, 5000008.0], 3, 'A', 99.41640786499875, 182.0)],
      dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)), ('Id', '<i4'), ('Text_fld', '<U255'), ('Shape_Length', '<f8'), ('Shape_Area', '<f8')])

a_np2
Out[43]:
array([(1, 0, 300020.0, 5000000.0), (1, 0, 300010.0, 5000000.0),
       (1, 0, 300010.0, 5000010.0), ..., (3, 1, 300002.0, 5000002.0),
       (3, 1, 300008.0, 5000002.0), (3, 1, 300005.0, 5000008.0)],
      dtype=[('ID_num', '<i4'), ('Part_num', '<i4'), ('Xs', '<f8'), ('Ys', '<f8')])

 

When the SpatialDataFrame is converted to a numpy record array, the geometry field (Shape) has a dtype of 'O', which is an object array.  This will be the 'normal' case since it is unlikely that all polygons will contain the same number of parts and points per part.  To truly be a numpy array, the 'shape' of the array needs to be consistent.  

 

The latter two representations, (a_np and a_np2) deal with this but converting the polygons to points.  These points can be converted back to polygons after they are used in some process such as moving, calculating parameters, reprojecting.

 

Next installation... working with geometry in its various representations.

ArcGIS API for Python  ... version 1.2.0

Another cleverly named product to provide more clarity to the other named kindred... ArcGIS for Desktop, ArcGIS Pro, Arc... etcetera.  The link to the help is here.  The ability to work with Jupyter notebooks, NumPy, SciPy and Arcpy is touted and welcomed (...and there is something about web mapping and stuff as well).

 

Where stuff is

Locating ArcGIS in your installation path, depends on how you installed it... for a single user (aka no sharesies) or for anyone.  This describes the single user situation.

To begin, import the module and check its __path__ and __file__ property.  My installation path is in an ArcPro folder that I created... yours will differ, but beyond the base folder, everything should be the same.

 

Basic import______________________________________________________________________

In [1]: import arcgis
In [2]: arcgis.__path__
Out[2]: ['C:\\ArcPro\\bin\\Python\\envs\\arcgispro-py3\\lib\\site-packages\\arcgis']

In [3]: arcgis.__file__

Out[3]: 'C:\\ArcPro\\bin\\Python\\envs\\arcgispro-py3\\lib\\site-packages\\arcgis\\__init__.py'

_________________________________________________________________________________

 

The __init__.py file is read during the import and a whole load of imports are done to cover access to pretty well all available modules contained in the ArcGIS path.

 

If you only want to work with the geometry or Pandas related functionality, you can import the SpatialDataFrame class directly.

 

SpatialDataFrame___________________________________________________________________

In [1]: from arcgis import SpatialDataFrame as sdf

# ---- or ----

In [1]: from arcgis.features._data.geodataset import SpatialDataFrame as sdf

In [2]: sdf.__module__  # to confirm the path

Out[2]: 'arcgis.features._data.geodataset.geodataframe'

_________________________________________________________________________________

 

The SpatialDataFrame class is pretty well all that is in geodataframe.py script.

Another useful call within that class is one to from_featureclass and to_featureclass which can be obtained by a variety of other imports or by a direct call to the io module's fileops.py

 

Featureclass access________________________________________________________________

In [3]: from arcgis.features._data.geodataset.io import from_featureclass, to_featureclass

# ---- or ----

In [4] from arcgis.features._data.geodataset import io

_________________________________________________________________________________

 

If you prefer the module.function method of calling, the option in ln [4] can be used to convert a featureclass to a SpatialDataFrame.

 

The Reveal________________________________________________________________________

In [5] fc0 = r'drive:\your_space_free_path\your_geodatabase.gdb\polygon'

In [6] a = io.from_featureclass(fc0)

In [7] print("\nSpatialDataFrame 'a' \n{}".format(a))  # ---- the sdf contents ----

 

SpatialDataFrame 'a'
   OBJECTID  Id Text_fld                                              SHAPE
0         1   1     None  {'rings': [[[300020, 5000000], [300010, 500000...
1         2   2        C  {'rings': [[[300010, 5000020], [300010, 500001...
2         3   3        A  {'rings': [[[300010, 5000010], [300010, 500002...

ln [8] type(a)

Out[8]: <class 'arcgis.features._data.geodataset.geodataframe.spatialdataframe'>

_________________________________________________________________________________

 

Behind the Scenes

Great so far... BUT when you delve deeper into what is going on under the hood, you will find out that the from_featureclass method...

  1. imports the SpatialDataFrame class
  2. uses arcpy.da.SearchCursor to convert the featureclass into lists of
    1. values (non-geometry fields)
    2. geometry (geometry field)
  3. the value fields are then converted to a pandas dataframe (pd.DataFrame)
  4. the dataframe in step 3 is then used with the geometry values to produce a SpatialDataFrame modelled after a Pandas dataframe.

 

Final Comments

So... in conclusion, an arcpy.da searchcursor is used to get all the necessary geometry and attribute data then ultimately to a SpatialDataFrame ... which is like a Pandas dataframe ... but it is geometry enabled.

Sort of like geopandas... (geopandas on GitHub) but relying on arc* functionality and code for the most part.

 

More to Come

There will be more posts as I have time... the next post... Geometry.... part II

Requirements:  You need to have ArcGIS Pro installed and the associated installation of jupyter notebooks as described in previous blogs. 

 

This is a novel way of presenting all things python, arcpy and eventually through the Python API for ArcGIS Pro...

 

Background link   Arcgis Pro 2... Jupyter Notebook Setup Basics

 

The link here...distance.pdf .... should open up the explanatory pdf for the accompanying jupyter notebook.  If not... click the attachment below.

 

Unfortunately, Jive isn't at the stage to allow for this type of interactive content.

 

If you have comments, questions or found a bug, send them to me via email.

In the last blog post... ArcGIS Pro 2... Creating Desktop Shortcuts .. I showed how to create desktop shortcuts to some of the more commonly used applications within esri's distribution of python.  In this post, the Jupyter Notebook will be addressed.

 

Before you start on this venture, make sure that you have read up on notebooks and see if they are for you and your workflow and not something that '... everyone is doing it, and so should I.... '.  They do have their place, but they require maintenance.

 

The first order of business.... 

  • Create a folder location to store your notebooks... it is just way easier to keep them in one location and distribute them from there.  I chose to put them in my local machine's GitHub folder, in a separate folder within. ... C:\GitDan\JupyterNoteBooks ... pretty clever ehhh?
  • Right-click on the file ---- "C:/ArcPro/bin/Python/envs/arcgispro-py3/Scripts/jupyter-notebook-script.py" ---- and select create shortcut  which will simply create a shortcut to that file.
    • Go to the Shortcut tab and edit it the Target line and put ...
    • ---- C:\ArcPro\bin\Python\envs\arcgispro-py3\pythonw.exe ---- in front what is there... this will yield the whole path to python in the Pro distribution, plus the script to run (yes, the paths are that long and cryptic).
    • C:\ArcPro\bin\Python\envs\arcgispro-py3\pythonw.exe "C:/ArcPro/bin/Python/envs/arcgispro-py3/Scripts/jupyter-notebook-script.py"
  • In the Start in: line, put the path to the folder that you are going to house your notebooks.  In my example, this was the folder ---- C:\Git_Dan\JupyterNoteBooks ----
  • Finally, right-click on the shortcut, select Copy, flip over to your Desktop and Paste it there.  
  • yes... I know you can go to the command interface and run the command line from there, but why.  You can also use Anaconda Navigator in other non-ArcGIS Pro environments.  The installation and setup of the application within the Pro environment isn't worth the extra effort.

 

 

Python runs that script, which imports notebook.notebookapp as shown below.  That import does the work of setting up notebooks to work in your target folder.

""" Source code of ---- jupyter-notebook-script.py ---- located in
:  _drive_:\_folder_\bin\Python\envs\arcgispro-py3\Scripts ... where _drive_ and _folder
:  are unique to your machine... for example
:  ---- C:\ArcPro\bin\Python\envs\arcgispro-py3\Scripts ----
"""

if __name__ == '__main__':
    import sys
    import notebook.notebookapp

    sys.exit(notebook.notebookapp.main())

 

The details aren't really relevant... Just click on the shortcut, create your notebook and begin sharing... but before you do that, begin your reading here

 

And from a recent post... perhaps the future might look like this...

 

Here are a few tips for creating desktop shortcuts to make working with Pro a tad easier.

Installation of packages is given in pictoral form at the end of the blog (I sort of forgot... )

 

I like to have stuff easily accessed. Desktop icons work for me. Here is an example of having the pro.bat, Spyder and Jupyter IDE icons created and organized in one spot.

 

 

Launch File Explorer and find your installation folder. In my case, I installed Pro in a folder with the same name.  The rest of your folder structure will be the same after this point. Navigate to the jupyter-qtconsole-script.py script within the path (see 1 and 2 in the figure)

 

 

Once the properties dialog is open, click on Shortcut, then ensure your path emulates the example below.  It consists of two parts... the first is the location of python.exe within the installation path... the second is the aforementioned script which is run and noted above.  I make sure that the Start in: is specified as well.

 

 

 

If you work on your machine solely, you can click on the Advanced button in the previous dialog and toggle off the Run as administrator checkbox.  This will just avoid you having to answer a security question when the shortcut is used.

 

And now you are done.

 

I created one for proenv.bat as well as Spyder for ArcGIS Pro since I use different versions of Spyder and python and separate conda distributions, so I keep different shortcuts.

Here are the equivalent options for the Spyder shortcut.

Target...  (pythonw.exe location and the spyder-script.py location)

         C:\ArcPro\bin\Python\envs\arcgispro-py3\pythonw.exe "C:/ArcPro/bin/Python/envs/arcgispro-py3/Scripts/spyder-script.py"

 Starts in....
     C:\ArcPro\bin\Python\envs\arcgispro-py3\Scripts
Now you can fire up that puppy and put it to some useless work.
So that is the Jupyter QtConsol... on to Spyder and Pythonwin later....
Enjoy
----------------------------------------------------------------------------------------------------------------------------------
A pictoral guide to installing packages in Pro
Get to the package manager
To get there, look south-west
Install, update and add.  If you miss anything, just go back and add it.
That is about it.