Which statistical library

Question asked on Aug 9, 2018
I'd like opinions, no code necessary, on libraries for performing basic descriptive statistics using the Python tools that ship in the box with an ArcGIS 10.3 Standard licence (so Python 2.7). I'm behind a corporate firewall and cannot install any additional libraries.


I'm not planning on doing ML or predictive analytics just yet. I have some simple requirements:

  • get simple aggregates (counts, count not null, count grouping by) 
  • measures of central tendency and dispersion (mean, mode, median, range, variance & standard deviation)
  • box plots, histograms and pie charts
  • output results to XLS


Inputs would be file GDB feature classes ranging in volume from 5 records to 5 million records, with most under 1 million records. Execution would be infrequent, maybe once or twice a day, to assist with generic data profiling. I'm not hooking this up to a high-volume web service, so speed is not critical. The ability to write clear code is.


I suspect I could achieve my requirements with one or more of: 

  • arcpy.da.SearchCursor (yeah, the long way 'round)
  • arcpy.Statistics_analysis
  • scipy.stats
  • numpy
  • pandas
  • sqlite3
  • matplotlib
  • xlwt
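
To make the "long way 'round" option concrete, here is a rough sketch of the kind of thing I'd end up writing with only the standard library. It is just an illustration, not a preferred approach: the names `describe` and `count_by` are made up, the sample values are invented, and I'm assuming null field values arrive as `None`. It should run unchanged on Python 2.7 and 3.x.

```python
from __future__ import division, print_function

import math
from collections import Counter


def describe(values):
    """Summarise an iterable of numbers; None is treated as null."""
    values = list(values)
    data = sorted(v for v in values if v is not None)
    n = len(data)
    if n == 0:
        return {"count": len(values), "count_not_null": 0}
    mean = sum(data) / n
    mid = n // 2
    median = data[mid] if n % 2 else (data[mid - 1] + data[mid]) / 2
    # Sample variance (n - 1 denominator); undefined for a single value.
    variance = sum((x - mean) ** 2 for x in data) / (n - 1) if n > 1 else None
    return {
        "count": len(values),
        "count_not_null": n,
        "mean": mean,
        "median": median,
        "mode": Counter(data).most_common(1)[0][0],
        "range": data[-1] - data[0],
        "variance": variance,
        "std_dev": math.sqrt(variance) if variance is not None else None,
    }


def count_by(keys):
    """Count occurrences of each grouping key (a 'count group by')."""
    return dict(Counter(keys))


# Hypothetical sample standing in for values read from a feature class.
sample = [2, 4, 4, None, 5, 7, 8]
summary = describe(sample)
groups = count_by(["road", "rail", "road", None])
print(summary)
print(groups)
```

In practice the values would come from something like `(row[0] for row in arcpy.da.SearchCursor(fc, ["SOME_FIELD"]))`, where `fc` and `SOME_FIELD` are placeholders. This holds every value in memory, which I assume is tolerable at 5 million records but have not measured.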


Getting into one or more of these will be a time investment that I'd hope will still be useful when (I hope) I am able to upgrade to ArcGIS 10.6 and ArcGIS Pro with Python 3.6 in the next year.


So, which do you think are worth learning for this purpose?