Should Esri ship third-party modules (like scipy) with ArcGIS?

08-08-2012 10:52 AM
curtvprice
MVP Esteemed Contributor
The ArcGIS Python 2.7 distribution for 10.1 includes (in addition to the Python standard library) the user-contributed modules numpy and matplotlib. It would be really great to include some more modules. If you agree, vote this item up and add your favorites (and use cases) in the comments.

Vote here, and also feel free to vote the idea up:

ideas.esri.com: Ship additional python extensions
33 Replies
MathewCoyle
Frequent Contributor
pywin32, specifically the win32api module.

I like to use it for quasi OS authentication by accessing Active Directory user names and permissions. Because it is a custom install right now, it is a hassle to package it with the standard deployment.
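As a rough illustration of the kind of lookup described above, here is a minimal sketch that asks Windows for the current login name through win32api.GetUserName() when pywin32 is installed. The helper name current_user and the standard-library fallback are assumptions added for illustration; the sketch does not query Active Directory permissions.

```python
import getpass


def current_user():
    """Return the login name of the user running the script."""
    try:
        # win32api ships with pywin32 and is only available on Windows.
        import win32api
        return win32api.GetUserName()
    except ImportError:
        pass
    try:
        # Standard-library fallback for machines without pywin32.
        return getpass.getuser()
    except Exception:
        # No login name could be determined in this environment.
        return "unknown"


print(current_user())
```

The fallback is what makes the deployment concern visible: the same script runs everywhere, but only machines with pywin32 installed take the win32api path.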
JasonScheirer
Occasional Contributor III
I will be visiting this thread fairly often to keep up-to-date on what you all think. We on the Python team want you to be happy and productive using Python on the ArcGIS stack.

Please keep in mind that we can't include everything requested as there is a lot of work in testing and integration for each library ([Windows|Linux]x[32 bit|64 bit]x[Many other factors]). For example, I had to help with a workaround for a memory leak in matplotlib that was causing many spatial statistics tools to leak huge amounts of memory. And what would pywin32 look like on Linux?

Adding a third-party module means adding an additional dependency in ArcGIS on that third party, which may or may not update regularly or anywhere near in sync with our own timetables. Our release dates and the release dates of the third party can result in scares for us: numpy didn't have support for Python 2.7 until a week before 10.1 beta's cutoff date, so we almost had to stick with 2.6 for 10.1. I can only imagine how much closer (or over?) we would have run if we had a scipy dependency on top of that. And the opposite is true, too: we started with upgrading to 2.7 in our daily builds when it was first released, updated and tested 10.1 to Python 2.7.2 during the development cycle, and then 2.7.3 came out right before 10.1's release date.

I can see one or two libraries being frozen at a version you don't like (maybe they're crashy or have a bug that only existed for a few builds or have awkward APIs which are later fixed) on every install of ArcGIS, and fear that could cause more pain than the hassle of shipping your own with your tools. Not to mention the entirely unknown quantity of not just porting ourselves to Python 3, but waiting for all our third-party dependencies to port as well.
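To put a rough number on the testing matrix mentioned in this post ([Windows|Linux] x [32 bit|64 bit] x other factors), the sketch below enumerates the combinations for a small set of libraries. The library list is an assumption standing in for the "many other factors" axis; it is not a statement of what Esri actually tests.

```python
from itertools import product

platforms = ["Windows", "Linux"]
bitness = ["32 bit", "64 bit"]
# Hypothetical third axis: the libraries that would each need testing.
libraries = ["numpy", "matplotlib", "scipy"]

# Every (platform, bitness, library) combination is a separate test target.
matrix = list(product(platforms, bitness, libraries))

for platform_name, bits, library in matrix:
    print(platform_name + " / " + bits + " / " + library)

print(len(matrix))  # 2 * 2 * 3 = 12 combinations
```

Each additional library adds four more targets even in this simplified two-axis platform model, before Python versions or ArcGIS releases are considered.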

There's also duplication of functionality in a lot of these libraries. I agree that we should get our GDAL bindings actually shipping and not just on the resource center. But on the other side, there's no realistic use case for Shapely considering ArcGIS already handles geometry in a very flexible, intelligent way. The only missing part is that it needs to be provided to you in Python. We exposed the relational operators in 10.0 on geometry objects, we added topological operators in 10.1, and 10.1 SP1 will ship with improved interoperability with WKT and Esri JSON.

Another example is with time zones. I like pytz and I'm used to using it, but ArcGIS ships with its own time zone database and its own way of doing things with the time-aware stuff that came with 10.0, so instead of bundling pytz with its own parallel time zone database I went ahead and integrated the ArcGIS way of doing things as arcpy.time, which had the additional benefit of the smart time delta class that uses ArcGIS' time APIs. This then also negated any specific need for dateutil, which is another library I really like. But arcpy.time handles standard Python datetime and tzinfo objects so it all interoperates just fine with everything else, and the skills you learn either from arcpy or pytz/dateutil transfer without a hitch.
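The interoperability point rests on the standard datetime and tzinfo types. The sketch below uses only the standard library, not arcpy.time, pytz, or dateutil, to show what a plain tzinfo object looks like and how a timezone-aware datetime is converted with it. The FixedOffset class and its labels are hypothetical helpers written for this illustration.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """A minimal tzinfo with a constant UTC offset and no daylight saving."""

    def __init__(self, hours, name):
        self._offset = timedelta(hours=hours)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # No daylight saving adjustment in this simplified zone.
        return timedelta(0)

    def tzname(self, dt):
        return self._name


utc = FixedOffset(0, "UTC")
minus_eight = FixedOffset(-8, "UTC-08:00")

noon_utc = datetime(2012, 8, 8, 12, 0, tzinfo=utc)
local = noon_utc.astimezone(minus_eight)

print(local.isoformat())  # 2012-08-08T04:00:00-08:00
```

Any library that accepts and returns objects of these two types can exchange values with any other, which is the sense in which skills transfer between them.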

Again, I will check back here frequently. Let me know what you think should be considered absolutely essential to Python to include in the ArcGIS stack going forward.
curtvprice
MVP Esteemed Contributor
Thanks Jason. I can see where you could get into a lot of trouble with this, but on the other hand I'm pretty comfortable living with issues in third-party libraries that you don't use yourselves.

Why Shapely? I've been told that Shapely can do certain things much faster (not having to interact with the ArcGIS system), which is why I included it. This is probably a big reason people are looking to have GDAL available: although the functionality may be there in the Spatial Analyst toolbox, there may be big performance gains using GDAL if integration with the ArcGIS environment (validation, projection, mask, etc.) is not needed for a geoprocessing use case.
KimOllivier
Occasional Contributor III
Shipping on the CD is not very helpful.

Do you really mean that these modules should be installed with Python?

I currently have a problem writing some tools to generate Excel spreadsheets as final reports. It is easy to do using the win32com module that comes with win32all (Pythonwin), which is included with the standard install but is not installed along with ArcGIS and Python.

The result is that scripts will work in a development environment but cannot be deployed because the modules are not installed organisation-wide.
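One way to make that gap visible before a script fails halfway through is to check for the required modules up front. This is a minimal sketch; the helper name missing_modules and the list of required modules are assumptions for illustration.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported here."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Hypothetical requirements for a report-generating tool.
required = ["win32com", "numpy"]

missing = missing_modules(required)
if missing:
    print("Cannot run: missing modules: " + ", ".join(missing))
else:
    print("All required modules are available.")
```

Running the check at the top of a tool gives the user a clear message to pass on to their administrators instead of a traceback.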

Is every IT administration department as difficult to deal with as the ones my students encounter?
They are very unhelpful about installing service patches, Python modules, customising Windows to include file extensions, and dozens of other settings that impact on ArcGIS users. I suggest various system default settings to optimise ArcGIS and they just roll their eyes.

A very serious performance hit is placing the default.gdb in the user's profile area which is not local, but across the network.

So rather than worry about some optional and less used modules, I would rather see the defaults loaded with the software to tackle intransigent and unskilled administrators. I suppose they can always reverse them to 'comply with Policy'.

As for the lowest-common-denominator point that Linux has no win32 capability: that is not an argument against having it for the majority who do use Windows applications. After all, ArcMap only works on Windows; by that argument we should not have ArcMap either, because it does not run on Linux.
MathewCoyle
Frequent Contributor

Please keep in mind that we can't include everything requested as there is a lot of work in testing and integration for each library ([Windows|Linux]x[32 bit|64 bit]x[Many other factors]).


Completely understandable, and they don't necessarily have to be default installs. It would be nice to have them as options where it is clear to the user/installer that these are third-party modules that are not directly supported and are to be used at your own risk. It would also be nice to see (I haven't seen any resource for this yet) examples of third-party modules and use cases popular internally with Esri, even if they are not standard or supported.
KimOllivier
Occasional Contributor III
Another example is with time zones. I like pytz and I'm used to using it, but ArcGIS ships with its own time zone database and its own way of doing things with the time-aware stuff that came with 10.0, so instead of bundling pytz with its own parallel time zone database I went ahead and integrated the ArcGIS way of doing things as arcpy.time, which had the additional benefit of the smart time delta class that uses ArcGIS' time APIs. This then also negated any specific need for dateutil, which is another library I really like. But arcpy.time handles standard Python datetime and tzinfo objects so it all interoperates just fine with everything else, and the skills you learn either from arcpy or pytz/dateutil transfer without a hitch.


When I backported GPX to Features in 10.1 I found that the time conversion tools do not work even at SP5 so I have to go back to dateutil.
curtvprice
MVP Esteemed Contributor
A very serious performance hit is placing the default.gdb in the user's profile area which is not local, but across the network.


Kim, this is a bit off-topic, but there is a fix for this in 10.0 SP5 and 10.1.

HowTo:  Set the default Home folder and geodatabase location for new map documents

Of course this requires getting SP5 installed!

In ArcMap 10.1 this setting is exposed in the options button on the Catalog window.
MathewCoyle
Frequent Contributor
There is a fix for this in 10.0 SP5 and 10.1.

HowTo:  Set the default Home folder and geodatabase location for new map documents

Of course this requires getting SP5 installed!


If you are hacking the registry anyways, you can accomplish 90% of that in any version of 10.0. You can set the default geoprocessing output to any gdb location you want.
JasonScheirer
Occasional Contributor III
When I backported GPX to Features in 10.1 I found that the time conversion tools do not work even at SP5 so I have to go back to dateutil.


Did you talk to tech support about it? What tool exactly didn't work?