Branched from: https://community.esri.com/thread/199323-re-trend-analysis-through-time-series-of-raster-data
Hi Xander Bakker, Some modification I want to do in correlation script.
I Would like to calculate the P value as a raster in addition to correlation raster.
Using scipy library, I just ran one script to obtain the correlation and P value. Script is attached below
import numpy as np
from scipy.stats import pearsonr
x = np.array([ 58295.62187335, 45420.95483714, 3398.64920064, 977.22166306, 5515.32801851, 14184.57621022, 16027.2803392 , 15313.01865824, 6443.2448182 ])
y = np.array([ 143547.79123381, 22996.69597427, 2591.56411049, 661.93115277, 8826.96549102, 17735.13549851, 11629.13003263, 14438.33177173, 6997.89334741])
r,p = pearsonr(x,y)
print "r", r
print "p", p
Output
Returns: | r : 0.831087956392
p-value : 0.00550539621039
|
---|
So instead of Numpy, can we use here Numpy and Scipy both library to get the addition p value or in the existing script can we add this function to obtain the p Value as a raster
Thanks & regards
Shouvik Jha
Solved! Go to Solution.
Hi shouvik jha , I branched this question to give it more visibility and to avoid letting the post grow too much and contain multiple questions.
Basically, what I would do is define a second numpy that will hold the pearson value. This is what it would look like:
def main():
print "load modules..."
import arcpy
arcpy.env.overwriteOutput = True
import numpy as np
template1 = r'C:\GeoNet\Correlation\Parameter1\r{0}_NPP.TIF'
template2 = r'C:\GeoNet\Correlation\Parameter2\r{0}_WUE.TIF'
nodata = -3.4028235e+38
out_ras_corr = r'C:\GeoNet\Correlation\Pearson\correlation.TIF'
out_ras_pear = r'C:\GeoNet\Correlation\Pearson\pearson.TIF'
print "create nested numpy array list..."
lst_np_ras = []
for i in range(1, 14):
ras_path1 = template1.format("%03d" % (i,))
print " - ", ras_path1
ras_np1 = arcpy.RasterToNumPyArray(ras_path1)
ras_path2 = template2.format("%03d" % (i,))
print " - ", ras_path2
ras_np2 = arcpy.RasterToNumPyArray(ras_path2)
lst_np_ras.append([ras_np1, ras_np2])
print "read props numpy raster..."
ras_np = lst_np_ras[0][0] # take first numpy array from list
rows = ras_np.shape[0]
cols = ras_np.shape[1]
print " - rows:", rows
print " - cols:", cols
print "create output numpy array..."
ras_path = template1.format("%03d" % (1,))
raster = arcpy.Raster(ras_path)
ras_np_res_corr = np.ndarray((rows, cols))
ras_np_res_pear = np.ndarray((rows, cols))
print " - out rows:", ras_np_res_corr.shape[0]
print " - out cols:", ras_np_res_corr.shape[1]
print "loop through pixels..."
pix_cnt = 0
for row in range(rows):
for col in range(cols):
pix_cnt += 1
if pix_cnt % 5000 == 0:
print " - row:", row, " col:", col, " pixel:", pix_cnt
lst_vals1 = []
lst_vals2 = []
try:
for lst_pars in lst_np_ras:
lst_vals1.append(lst_pars[0][row, col])
lst_vals2.append(lst_pars[1][row, col])
lst_vals1 = ReplaceNoData(lst_vals1, nodata)
lst_vals2 = ReplaceNoData(lst_vals2, nodata)
# perform calculation on list
correlation = CalculateCorrelation(lst_vals1, lst_vals2, nodata)
ras_np_res_corr[row, col] = correlation
pearson = CalculatePearsons(lst_vals1, lst_vals2, nodata)
ras_np_res_pear[row, col] = pearson
except Exception as e:
print "ERR:", e
print " - row:", row, " col:", col, " pixel:", pix_cnt
print " - lst_vals1:", lst_vals1
print " - lst_vals2:", lst_vals2
pnt = arcpy.Point(raster.extent.XMin, raster.extent.YMin) # - raster.meanCellHeight
xcellsize = raster.meanCellWidth
ycellsize = raster.meanCellHeight
print "Write output rasters..."
print " - ", out_ras_corr
ras_res_corr = arcpy.NumPyArrayToRaster(ras_np_res_corr, lower_left_corner=pnt, x_cell_size=xcellsize,
y_cell_size=ycellsize, value_to_nodata=nodata)
ras_res_corr.save(out_ras_corr)
arcpy.DefineProjection_management(in_dataset=out_ras_corr, coor_system="GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]]")
print " - ", out_ras_pear
ras_res_pear = arcpy.NumPyArrayToRaster(ras_np_res_pear, lower_left_corner=pnt, x_cell_size=xcellsize,
y_cell_size=ycellsize, value_to_nodata=nodata)
ras_res_pear.save(out_ras_pear)
arcpy.DefineProjection_management(in_dataset=out_ras_pear, coor_system="GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]]")
def CalculateCorrelation(a, b, nodata):
import numpy
try:
coef = numpy.corrcoef(a,b)
return coef[0][1]
except:
return nodata
def ReplaceNoData(lst, nodata):
res = []
for a in lst:
if a < nodata / 2.0:
res.append(None)
else:
res.append(a)
return res
def CalculatePearsons(a, b, nodata):
import numpy
from scipy.stats import pearsonr
try:
x = np.array(a)
y = np.array(b)
r, p = pearsonr(a, y)
return p
except:
return nodata
if __name__ == '__main__':
main()
At this moment I am getting a "RuntimeError: ERROR 999998: Unexpected Error" on line 73 when saving the output correlation raster. I will look into that and see what the problem is. You could try and run the code to see if it works on your system.
Hi shouvik jha , just like the original code has the function CalculateCorrelation:
def CalculateCorrelation(a, b, nodata):
import numpy
try:
coef = numpy.corrcoef(a,b)
return coef[0][1]
except:
return nodata
... you could implement one that calculates p:
def CalculatePearsons(a, b, nodata):
import numpy
from scipy.stats import pearsonr
try:
x = np.array(a)
y = np.array(b)
r, p = pearsonr(a, y)
return p
except:
return nodata
Include the function and on line 62 change:
correlation = CalculateCorrelation(lst_vals1, lst_vals2, nodata)
by (although the "correlation" will be the p value) :
correlation = CalculatePearsons(lst_vals1, lst_vals2, nodata)
I did not try the code though.
Hi Xander Bakker, Thank you for the suggestion.
I trying to modify the code as you suggested but i don't understand, where I have to call the p Value raster.
From our earlier script, We got the output only Correlation raster, but this time I want Correlation raster as well as P value raster.
Snap from our earlier script on Input and output:
import numpy as np
template1 = r'D:\Correlation\Parameter1\r{0}_NPP.TIF'
template2 = r'D:\Correlation\Parameter2\r{0}_WUE.TIF'
nodata = -3.4028235e+38
out_ras_Corr = r'D:\Correlation\correlation.TIF' # Already exist in our script
out_ras_p = r'D:\Correlation\Significance.TIF' # this one i want to add
Hi Xander Bakker, I am facing same problem here also.
we eagerly await your response
Thanks & regards
Shouvik Jha
Hi shouvik jha , I branched this question to give it more visibility and to avoid letting the post grow too much and contain multiple questions.
Basically, what I would do is define a second numpy that will hold the pearson value. This is what it would look like:
def main():
print "load modules..."
import arcpy
arcpy.env.overwriteOutput = True
import numpy as np
template1 = r'C:\GeoNet\Correlation\Parameter1\r{0}_NPP.TIF'
template2 = r'C:\GeoNet\Correlation\Parameter2\r{0}_WUE.TIF'
nodata = -3.4028235e+38
out_ras_corr = r'C:\GeoNet\Correlation\Pearson\correlation.TIF'
out_ras_pear = r'C:\GeoNet\Correlation\Pearson\pearson.TIF'
print "create nested numpy array list..."
lst_np_ras = []
for i in range(1, 14):
ras_path1 = template1.format("%03d" % (i,))
print " - ", ras_path1
ras_np1 = arcpy.RasterToNumPyArray(ras_path1)
ras_path2 = template2.format("%03d" % (i,))
print " - ", ras_path2
ras_np2 = arcpy.RasterToNumPyArray(ras_path2)
lst_np_ras.append([ras_np1, ras_np2])
print "read props numpy raster..."
ras_np = lst_np_ras[0][0] # take first numpy array from list
rows = ras_np.shape[0]
cols = ras_np.shape[1]
print " - rows:", rows
print " - cols:", cols
print "create output numpy array..."
ras_path = template1.format("%03d" % (1,))
raster = arcpy.Raster(ras_path)
ras_np_res_corr = np.ndarray((rows, cols))
ras_np_res_pear = np.ndarray((rows, cols))
print " - out rows:", ras_np_res_corr.shape[0]
print " - out cols:", ras_np_res_corr.shape[1]
print "loop through pixels..."
pix_cnt = 0
for row in range(rows):
for col in range(cols):
pix_cnt += 1
if pix_cnt % 5000 == 0:
print " - row:", row, " col:", col, " pixel:", pix_cnt
lst_vals1 = []
lst_vals2 = []
try:
for lst_pars in lst_np_ras:
lst_vals1.append(lst_pars[0][row, col])
lst_vals2.append(lst_pars[1][row, col])
lst_vals1 = ReplaceNoData(lst_vals1, nodata)
lst_vals2 = ReplaceNoData(lst_vals2, nodata)
# perform calculation on list
correlation = CalculateCorrelation(lst_vals1, lst_vals2, nodata)
ras_np_res_corr[row, col] = correlation
pearson = CalculatePearsons(lst_vals1, lst_vals2, nodata)
ras_np_res_pear[row, col] = pearson
except Exception as e:
print "ERR:", e
print " - row:", row, " col:", col, " pixel:", pix_cnt
print " - lst_vals1:", lst_vals1
print " - lst_vals2:", lst_vals2
pnt = arcpy.Point(raster.extent.XMin, raster.extent.YMin) # - raster.meanCellHeight
xcellsize = raster.meanCellWidth
ycellsize = raster.meanCellHeight
print "Write output rasters..."
print " - ", out_ras_corr
ras_res_corr = arcpy.NumPyArrayToRaster(ras_np_res_corr, lower_left_corner=pnt, x_cell_size=xcellsize,
y_cell_size=ycellsize, value_to_nodata=nodata)
ras_res_corr.save(out_ras_corr)
arcpy.DefineProjection_management(in_dataset=out_ras_corr, coor_system="GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]]")
print " - ", out_ras_pear
ras_res_pear = arcpy.NumPyArrayToRaster(ras_np_res_pear, lower_left_corner=pnt, x_cell_size=xcellsize,
y_cell_size=ycellsize, value_to_nodata=nodata)
ras_res_pear.save(out_ras_pear)
arcpy.DefineProjection_management(in_dataset=out_ras_pear, coor_system="GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]]")
def CalculateCorrelation(a, b, nodata):
import numpy
try:
coef = numpy.corrcoef(a,b)
return coef[0][1]
except:
return nodata
def ReplaceNoData(lst, nodata):
res = []
for a in lst:
if a < nodata / 2.0:
res.append(None)
else:
res.append(a)
return res
def CalculatePearsons(a, b, nodata):
import numpy
from scipy.stats import pearsonr
try:
x = np.array(a)
y = np.array(b)
r, p = pearsonr(a, y)
return p
except:
return nodata
if __name__ == '__main__':
main()
At this moment I am getting a "RuntimeError: ERROR 999998: Unexpected Error" on line 73 when saving the output correlation raster. I will look into that and see what the problem is. You could try and run the code to see if it works on your system.
Hi Xander Bakker Thank you very much for such wonderful script.
However I am running the script on my system and will inform you shortly.
Í just noted an error: line 104 should be:
import numpy as np
I also changed lines 73 and 79 into :
arcpy.CopyRaster_management(ras_res_corr, out_ras_corr)
and
arcpy.CopyRaster_management(ras_res_pear, out_ras_pear)
On a small subset it worked without problems. I did not check the resulting Pearson value though...
Hi xander_bakker, Thank you very much for your such great cooperation. We are really very thankful for such wonderful and useful script from you.
The updated script is running perfectly and produced the exact result.
Thanks & regards
Shouvik Jha
Glad to hear that the code worked. Could you mark the question answered?
Kind regards, Xander