All People > Dan_Patterson > Py... blog > 2014 > August

Py... blog

August 2014


Author:    Aug, 2014
  - Demonstrates line graph and scatterplot usage using pyplot and numpy
  - A sample shapefile is included for testing purposes


  - Simply unzip the zip file into a folder and run the script.  You will be
    asked to enter 1, 2 or 3, which represent the 3 different samples.
  - The script uses raw_input to get the number; it is slow in PyScripter for some
    reason.


  - When Arcscripts 2.0 opens up, a fuller toolbox will be posted that allows for more
    interactive mapping options.  Send me an email with any special requests.
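
Under Python 3, raw_input is gone and input takes its place. A minimal sketch of the prompt, assuming a hypothetical helper named ask_sample (it is not part of the zipped script) that keeps asking until it gets 1, 2 or 3:

```python
def ask_sample(read=input, valid=(1, 2, 3)):
    """Return the sample number entered by the user.

    `read` is any callable that takes a prompt and returns a string; it
    defaults to the built-in input (raw_input in Python 2).
    ask_sample is a hypothetical name, used here for illustration only.
    """
    while True:
        reply = read("Enter sample number (1, 2 or 3): ").strip()
        if reply.isdigit() and int(reply) in valid:
            return int(reply)
        print("Please enter one of: {}".format(", ".join(map(str, valid))))
```

Calling ask_sample() with no arguments prompts at the console; passing a different `read` callable makes it easy to try out without typing.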

A simple verbose demo again, so that I don't forget. 

I updated it to conform to Python 3.x... which will inevitably hit ArcMap.

The script is pretty self-explanatory and no effort has been made to simplify it.


  To demonstrate the utility of using the collections module to
  obtain a simple frequency distribution, in this case, of a list
  of random integers.  It is written verbosely so that the user
  can see the sequence of events and the results of the various
  steps.
  A sequence of random numbers is generated and a dictionary of
  key:values is produced by collections.Counter.  The resultant
  keys are copied to "classes" to prevent alteration of the
  initial keys.  An extra class is appended to the list to ensure
  that the last class in the keys gets its own bin, since the behaviour
  of np.histogram is to combine the last two classes into
  one frequency (the final bin includes its right-hand edge).  I just
  add a value of 1 to the last class to produce an extra bin.
  A histogram is produced which contains the classes and the frequency
  for those classes.

import collections
import random
import numpy as np
from matplotlib import pyplot as plt

rand_int = [random.randrange(1, 6) for i in range(15)]
freq = collections.Counter(rand_int)   # "dict" would shadow the builtin
keys = freq.keys()
counts = freq.values()
classes = sorted(keys)                 # sorted copy of the keys; bin edges must increase
classes.append(classes[-1] + 1)        # to ensure that the last bin has values

histo = np.histogram(rand_int, classes)
args = [rand_int, freq, keys, counts, histo, histo[1], histo[0]]
frmt = """
Collections and pylab
Random integers: {}
Collections dict:
     {}
  keys:           {}
  values (freq):  {}
Histogram         {}
   classes:       {}
   frequency:     {}
"""
print(frmt.format(*args))

plt.hist(rand_int, bins=classes)       # assumed plotting call, the original line is missing
plt.title("Sample Histogram", loc='center')
plt.xlabel("class"); plt.ylabel("frequency")
plt.show()



Collections and pylab
Random integers: [4, 5, 3, 4, 2, 4, 1, 5, 4, 5, 4, 5, 2, 2, 3]
Collections dict:
     Counter({4: 5, 5: 4, 2: 3, 3: 2, 1: 1})
  keys:           dict_keys([1, 2, 3, 4, 5])
  values (freq):  dict_values([1, 3, 2, 5, 4]) 
Histogram         (array([1, 3, 2, 5, 4]), array([1, 2, 3, 4, 5, 6]))
   classes:       [1 2 3 4 5 6]
   frequency:     [1 3 2 5 4]
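
The reason for the extra class can be seen by running np.histogram on the integers above with and without it. A small sketch, assuming only numpy; the variable names are mine and not from the script:

```python
import numpy as np

# the random integers from the sample output above
rand_int = [4, 5, 3, 4, 2, 4, 1, 5, 4, 5, 4, 5, 2, 2, 3]

# bin edges taken straight from the keys: the final bin is [4, 5], closed on
# both ends, so the 4s and the 5s are counted together
short_freq, short_edges = np.histogram(rand_int, bins=[1, 2, 3, 4, 5])

# one extra edge gives 5 its own bin, [5, 6]
full_freq, full_edges = np.histogram(rand_int, bins=[1, 2, 3, 4, 5, 6])

print("without extra class:", short_freq)
print("with extra class:   ", full_freq)
```

The first call reports 9 in its last bin (the five 4s plus the four 5s); the second matches the frequencies in the output above.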


As a simple histogram.


Which of course can be fancied up to suit your needs.  Matplotlib is certainly one package to explore... and there are even high-end graphics modules.

Without any explanation... I just didn't want to forget it.  More to follow when I get the graphing stuff finished.  It served as a good opportunity to explore numpy in more detail.  No effort was made to simplify it down further.  I will be adding shapefile reading capabilities as well.  Just open it and run it... a simple data set is contained within.  I have just copied and pasted it here until the issues with Python encoding and IE 11 are sorted out... sorry.




  Calculates the correlation coefficient and regression parameters for simple
  correlation using numpy.
import numpy as np


def correlation(xs,ys):
  s_x = np.std(xs, ddof=1, dtype=np.float64)
  s_y = np.std(ys, ddof=1, dtype=np.float64)
  covar = np.cov(xs,ys)[0][1]
  r = covar/(s_x * s_y)
  return r


def regress1D(xs, ys, dim=1):
  '''simple first-order least squares regression in the form y=mx+b'''
  coeff = np.polyfit(xs,ys,dim)
  polynomial = np.poly1d(coeff)
  y_cal = polynomial(xs)
  return [coeff, y_cal]


if __name__ == "__main__":
  xs = [0,1,2,3,4,5,6,7,8,9,10]
  ys = [0.2,0.7,2.7,2.6,4.1,5,5.7,7.3,7.9,9.1,9.8]
  r = correlation(xs,ys)
  coeff, y_cal = regress1D(xs,ys,1)
  text = 'y = %.3fx + %.3f' % (coeff[0],coeff[1])
  print("\nPearson's r: ", r)
  print("Equation (y=mx+b): ", text)
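
As a quick check on the two functions, the same numbers can be obtained from numpy directly. A sketch, assuming the sample data above; np.corrcoef returns the correlation matrix, and for a straight-line fit the slope equals r * s_y / s_x. The variable names are mine:

```python
import numpy as np

xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [0.2, 0.7, 2.7, 2.6, 4.1, 5, 5.7, 7.3, 7.9, 9.1, 9.8]

# Pearson's r from the off-diagonal of the correlation matrix
r_check = np.corrcoef(xs, ys)[0][1]

# slope and intercept from the first-order least squares fit
slope, intercept = np.polyfit(xs, ys, 1)

# the slope can also be recovered from r and the two sample standard deviations
s_x = np.std(xs, ddof=1)
s_y = np.std(ys, ddof=1)
slope_from_r = r_check * s_y / s_x

# the fitted line passes through the point of means
intercept_from_means = np.mean(ys) - slope * np.mean(xs)

print("r (corrcoef):    ", r_check)
print("slope (polyfit): ", slope)
print("slope (r*sy/sx): ", slope_from_r)
print("intercept:       ", intercept, intercept_from_means)
```

If the two slopes or the two intercepts disagree beyond rounding, something is off in the inputs (for example, mixing ddof=0 and ddof=1 between the covariance and the standard deviations).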