# Calculate Correlation and Pearson between two sets of rasters

18000
11
08-21-2017 11:33 AM Occasional Contributor III

Hi Xander Bakker‌,  Some modification I want to do in correlation script.

I Would like to calculate the P value as a raster in addition to correlation raster.

Using scipy library, I just ran one script to obtain the correlation and P value. Script is attached below

``````import numpy as np
from scipy.stats import pearsonr
x = np.array([ 58295.62187335,  45420.95483714,   3398.64920064,    977.22166306, 5515.32801851,  14184.57621022,  16027.2803392 ,  15313.01865824, 6443.2448182 ])
y = np.array([ 143547.79123381,   22996.69597427,    2591.56411049, 661.93115277,    8826.96549102,   17735.13549851, 11629.13003263,   14438.33177173,    6997.89334741])

r,p = pearsonr(x,y)

print "r", r

print "p", p

‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍``````

Output

Returns: r : 0.831087956392Pearson’s correlation coefficientp-value : 0.005505396210392-tailed p-value

So instead of Numpy, can we use here Numpy and Scipy both library to get the addition p value or in the existing  script can we add this function to obtain the p Value as a raster

Thanks & regards

Shouvik Jha

Tags (7)
1 Solution

Accepted Solutions by Esri Esteemed Contributor

Hi shouvik jha , I branched this question to give it more visibility and to avoid letting the post grow too much and contain multiple questions.

Basically, what I would do is define a second numpy that will hold the pearson value. This is what it would look like:

• line 11 defined the pearson output raster path and name
• line 35 creates the pearson numpy array
• line 57 and 58 will call the pearson function and store the result in the numpy array
• line 76 to 80 will store the output pearson raster
• line 103 to 112 contains the function that calculated the pearson value

``````def main():
import arcpy
arcpy.env.overwriteOutput = True

import numpy as np
template1 = r'C:\GeoNet\Correlation\Parameter1\r{0}_NPP.TIF'
template2 = r'C:\GeoNet\Correlation\Parameter2\r{0}_WUE.TIF'
nodata = -3.4028235e+38
out_ras_corr = r'C:\GeoNet\Correlation\Pearson\correlation.TIF'
out_ras_pear = r'C:\GeoNet\Correlation\Pearson\pearson.TIF'

print "create nested numpy array list..."
lst_np_ras = []
for i in range(1, 14):
ras_path1 = template1.format("%03d" % (i,))
print " - ", ras_path1
ras_np1 = arcpy.RasterToNumPyArray(ras_path1)
ras_path2 = template2.format("%03d" % (i,))
print " - ", ras_path2
ras_np2 = arcpy.RasterToNumPyArray(ras_path2)
lst_np_ras.append([ras_np1, ras_np2])

ras_np = lst_np_ras  # take first numpy array from list
rows = ras_np.shape
cols = ras_np.shape
print " - rows:", rows
print " - cols:", cols

print "create output numpy array..."
ras_path = template1.format("%03d" % (1,))
raster = arcpy.Raster(ras_path)
ras_np_res_corr = np.ndarray((rows, cols))
ras_np_res_pear = np.ndarray((rows, cols))
print " - out rows:", ras_np_res_corr.shape
print " - out cols:", ras_np_res_corr.shape

print "loop through pixels..."
pix_cnt = 0
for row in range(rows):
for col in range(cols):
pix_cnt += 1
if pix_cnt % 5000 == 0:
print " - row:", row, "  col:", col, "  pixel:", pix_cnt
lst_vals1 = []
lst_vals2 = []
try:
for lst_pars in lst_np_ras:
lst_vals1.append(lst_pars[row, col])
lst_vals2.append(lst_pars[row, col])
lst_vals1 = ReplaceNoData(lst_vals1, nodata)
lst_vals2 = ReplaceNoData(lst_vals2, nodata)
# perform calculation on list
correlation = CalculateCorrelation(lst_vals1, lst_vals2, nodata)
ras_np_res_corr[row, col] = correlation
pearson = CalculatePearsons(lst_vals1, lst_vals2, nodata)
ras_np_res_pear[row, col] = pearson
except Exception as e:
print "ERR:", e
print " - row:", row, "  col:", col, "  pixel:", pix_cnt
print " - lst_vals1:", lst_vals1
print " - lst_vals2:", lst_vals2

pnt = arcpy.Point(raster.extent.XMin, raster.extent.YMin) #  - raster.meanCellHeight
xcellsize = raster.meanCellWidth
ycellsize = raster.meanCellHeight

print "Write output rasters..."
print " - ", out_ras_corr
ras_res_corr = arcpy.NumPyArrayToRaster(ras_np_res_corr, lower_left_corner=pnt, x_cell_size=xcellsize,
y_cell_size=ycellsize, value_to_nodata=nodata)
ras_res_corr.save(out_ras_corr)
arcpy.DefineProjection_management(in_dataset=out_ras_corr, coor_system="GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]]")

print " - ", out_ras_pear
ras_res_pear = arcpy.NumPyArrayToRaster(ras_np_res_pear, lower_left_corner=pnt, x_cell_size=xcellsize,
y_cell_size=ycellsize, value_to_nodata=nodata)
ras_res_pear.save(out_ras_pear)
arcpy.DefineProjection_management(in_dataset=out_ras_pear, coor_system="GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]]")

def CalculateCorrelation(a, b, nodata):
import numpy
try:
coef = numpy.corrcoef(a,b)
return coef
except:
return nodata

def ReplaceNoData(lst, nodata):
res = []
for a in lst:
if a < nodata / 2.0:
res.append(None)
else:
res.append(a)
return res

def CalculatePearsons(a, b, nodata):
import numpy
from scipy.stats import pearsonr
try:
x = np.array(a)
y = np.array(b)
r, p = pearsonr(a, y)
return p
except:
return nodata

if __name__ == '__main__':
main()‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍``````

At this moment I am getting a "RuntimeError: ERROR 999998: Unexpected Error" on line 73 when saving the output correlation raster. I will look into that and see what the problem is. You could try and run the code to see if it works on your system.

11 Replies by Esri Esteemed Contributor

Hi shouvik jha , just like the original code has the function CalculateCorrelation:

``````def CalculateCorrelation(a, b, nodata):
import numpy
try:
coef = numpy.corrcoef(a,b)
return coef
except:
return nodata‍‍‍‍‍‍‍``````

... you could implement one that calculates p:

``````def CalculatePearsons(a, b, nodata):
import numpy
from scipy.stats import pearsonr
try:
x = np.array(a)
y = np.array(b)
r, p = pearsonr(a, y)
return p
except:
return nodata‍‍‍‍‍‍‍‍‍‍``````

Include the function and on line 62 change:

``correlation = CalculateCorrelation(lst_vals1, lst_vals2, nodata)‍``

by (although the "correlation" will be the p value) :

``correlation = CalculatePearsons(lst_vals1, lst_vals2, nodata)‍``

I did not try the code though. Occasional Contributor III

Hi Xander Bakker‌, Thank you for the suggestion.

I trying to modify the code as you suggested but i don't understand, where I have to call the p Value raster.

From our earlier script, We got the output only Correlation raster, but this time I want Correlation raster as well as P value raster.

Snap from our earlier script on Input and output:

``````import numpy as np
template1 = r'D:\Correlation\Parameter1\r{0}_NPP.TIF'
template2 = r'D:\Correlation\Parameter2\r{0}_WUE.TIF'
nodata = -3.4028235e+38
out_ras_Corr = r'D:\Correlation\correlation.TIF' # Already exist in our script
out_ras_p    = r'D:\Correlation\Significance.TIF' # this one i want to add
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍`````` Occasional Contributor III

Hi Xander Bakker‌, I am facing same problem here also.

Thanks & regards

Shouvik Jha by Esri Esteemed Contributor

Hi shouvik jha , I branched this question to give it more visibility and to avoid letting the post grow too much and contain multiple questions.

Basically, what I would do is define a second numpy that will hold the pearson value. This is what it would look like:

• line 11 defined the pearson output raster path and name
• line 35 creates the pearson numpy array
• line 57 and 58 will call the pearson function and store the result in the numpy array
• line 76 to 80 will store the output pearson raster
• line 103 to 112 contains the function that calculated the pearson value

``````def main():
import arcpy
arcpy.env.overwriteOutput = True

import numpy as np
template1 = r'C:\GeoNet\Correlation\Parameter1\r{0}_NPP.TIF'
template2 = r'C:\GeoNet\Correlation\Parameter2\r{0}_WUE.TIF'
nodata = -3.4028235e+38
out_ras_corr = r'C:\GeoNet\Correlation\Pearson\correlation.TIF'
out_ras_pear = r'C:\GeoNet\Correlation\Pearson\pearson.TIF'

print "create nested numpy array list..."
lst_np_ras = []
for i in range(1, 14):
ras_path1 = template1.format("%03d" % (i,))
print " - ", ras_path1
ras_np1 = arcpy.RasterToNumPyArray(ras_path1)
ras_path2 = template2.format("%03d" % (i,))
print " - ", ras_path2
ras_np2 = arcpy.RasterToNumPyArray(ras_path2)
lst_np_ras.append([ras_np1, ras_np2])

ras_np = lst_np_ras  # take first numpy array from list
rows = ras_np.shape
cols = ras_np.shape
print " - rows:", rows
print " - cols:", cols

print "create output numpy array..."
ras_path = template1.format("%03d" % (1,))
raster = arcpy.Raster(ras_path)
ras_np_res_corr = np.ndarray((rows, cols))
ras_np_res_pear = np.ndarray((rows, cols))
print " - out rows:", ras_np_res_corr.shape
print " - out cols:", ras_np_res_corr.shape

print "loop through pixels..."
pix_cnt = 0
for row in range(rows):
for col in range(cols):
pix_cnt += 1
if pix_cnt % 5000 == 0:
print " - row:", row, "  col:", col, "  pixel:", pix_cnt
lst_vals1 = []
lst_vals2 = []
try:
for lst_pars in lst_np_ras:
lst_vals1.append(lst_pars[row, col])
lst_vals2.append(lst_pars[row, col])
lst_vals1 = ReplaceNoData(lst_vals1, nodata)
lst_vals2 = ReplaceNoData(lst_vals2, nodata)
# perform calculation on list
correlation = CalculateCorrelation(lst_vals1, lst_vals2, nodata)
ras_np_res_corr[row, col] = correlation
pearson = CalculatePearsons(lst_vals1, lst_vals2, nodata)
ras_np_res_pear[row, col] = pearson
except Exception as e:
print "ERR:", e
print " - row:", row, "  col:", col, "  pixel:", pix_cnt
print " - lst_vals1:", lst_vals1
print " - lst_vals2:", lst_vals2

pnt = arcpy.Point(raster.extent.XMin, raster.extent.YMin) #  - raster.meanCellHeight
xcellsize = raster.meanCellWidth
ycellsize = raster.meanCellHeight

print "Write output rasters..."
print " - ", out_ras_corr
ras_res_corr = arcpy.NumPyArrayToRaster(ras_np_res_corr, lower_left_corner=pnt, x_cell_size=xcellsize,
y_cell_size=ycellsize, value_to_nodata=nodata)
ras_res_corr.save(out_ras_corr)
arcpy.DefineProjection_management(in_dataset=out_ras_corr, coor_system="GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]]")

print " - ", out_ras_pear
ras_res_pear = arcpy.NumPyArrayToRaster(ras_np_res_pear, lower_left_corner=pnt, x_cell_size=xcellsize,
y_cell_size=ycellsize, value_to_nodata=nodata)
ras_res_pear.save(out_ras_pear)
arcpy.DefineProjection_management(in_dataset=out_ras_pear, coor_system="GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]]")

def CalculateCorrelation(a, b, nodata):
import numpy
try:
coef = numpy.corrcoef(a,b)
return coef
except:
return nodata

def ReplaceNoData(lst, nodata):
res = []
for a in lst:
if a < nodata / 2.0:
res.append(None)
else:
res.append(a)
return res

def CalculatePearsons(a, b, nodata):
import numpy
from scipy.stats import pearsonr
try:
x = np.array(a)
y = np.array(b)
r, p = pearsonr(a, y)
return p
except:
return nodata

if __name__ == '__main__':
main()‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍``````

At this moment I am getting a "RuntimeError: ERROR 999998: Unexpected Error" on line 73 when saving the output correlation raster. I will look into that and see what the problem is. You could try and run the code to see if it works on your system. Occasional Contributor III

Hi Xander Bakker Thank you very much for such wonderful script.

However I am running the script on my system and will inform you shortly. by Esri Esteemed Contributor

Í just noted an error: line 104 should be:

``import numpy as np‍`` by Esri Esteemed Contributor

I also changed lines 73 and 79 into :

``    arcpy.CopyRaster_management(ras_res_corr, out_ras_corr)‍``

and

``    arcpy.CopyRaster_management(ras_res_pear, out_ras_pear)‍``

On a small subset it worked without problems. I did not check the resulting Pearson value though... Occasional Contributor III

Hi xander_bakker‌, Thank you very much for your such great cooperation. We are really very thankful for such wonderful and useful script from you.

The updated script is running perfectly and produced the exact result.

Thanks & regards

Shouvik Jha by Esri Esteemed Contributor

Glad to hear that the code worked. Could you mark the question answered?

Kind regards, Xander 