# Correlation and regression using numpy

Blog Post created by Dan_Patterson on Aug 17, 2014

Posted without much explanation; I just didn't want to forget it. More will follow once I finish the graphing code. It was a good opportunity to explore numpy in more detail, and I made no effort to simplify the script further. I will be adding shapefile-reading capabilities as well. Just open it and run it: a simple data set is included. I have copied and pasted the code here until the issues with Python encoding and IE 11 are sorted out, sorry.

'''
correlation.py

Author:  Dan.Patterson@carleton.ca

Purpose:
calculates the correlation coefficient and regression parameters for simple
correlation using numpy
'''
import numpy as np

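def correlation(xs, ys):
    '''Pearson's r from the sample standard deviations and covariance'''
    s_x = np.std(xs, ddof=1, dtype=np.float64)
    s_y = np.std(ys, ddof=1, dtype=np.float64)
    # np.cov normalizes by N-1 by default, matching ddof=1 above
    covar = np.cov(xs, ys)[0][1]
    r = covar / (s_x * s_y)
    return r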

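def regress1D(xs, ys, dim=1):
    '''simple least squares regression; with the default dim=1 the fit is
    first-order, in the form y=mx+b'''
    # coefficients are returned highest power first: [m, b] when dim=1
    coeff = np.polyfit(xs, ys, dim)
    polynomial = np.poly1d(coeff)
    # fitted y values at the input xs
    y_cal = polynomial(xs)
    return [coeff, y_cal]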

if __name__ == "__main__":
    # simple sample data set
    xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    ys = [0.2, 0.7, 2.7, 2.6, 4.1, 5, 5.7, 7.3, 7.9, 9.1, 9.8]
    r = correlation(xs, ys)
    coeff, y_cal = regress1D(xs, ys, 1)
    text = 'y = %.3fx + %.3f' % (coeff[0], coeff[1])
    # print() with one pre-formatted string behaves the same in Python 2 and 3
    print("\nPearson's r: %s" % r)
    print("Equation (y=mx+b): %s" % text)
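
As a quick sanity check on the script above, here is a minimal sketch (not part of the original script) that cross-checks Pearson's r against `np.corrcoef` and confirms that, for a first-order fit, the coefficient of determination computed from the residuals equals r squared. The names `m`, `b`, `ss_res`, `ss_tot` and `r_squared` are my own choices for illustration; the data set is the same one used above.

```python
import numpy as np

# Same sample data as in the script above
xs = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=np.float64)
ys = np.array([0.2, 0.7, 2.7, 2.6, 4.1, 5, 5.7, 7.3, 7.9, 9.1, 9.8])

# Pearson's r read directly from the 2x2 correlation matrix
r = np.corrcoef(xs, ys)[0, 1]

# First-order fit; np.polyfit returns the slope first, then the intercept
m, b = np.polyfit(xs, ys, 1)
y_cal = m * xs + b

# Coefficient of determination from the residuals of the fit
ss_res = np.sum((ys - y_cal) ** 2)      # residual sum of squares
ss_tot = np.sum((ys - ys.mean()) ** 2)  # total sum of squares
r_squared = 1.0 - ss_res / ss_tot

print("Pearson's r (np.corrcoef): %.4f" % r)
print("R squared from residuals:  %.4f" % r_squared)
print("r squared:                 %.4f" % (r ** 2))
```

The last two printed values should agree to within floating-point error. That equality holds for a straight-line least squares fit with an intercept; it is not expected to hold if you raise `dim` above 1 in `regress1D`.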