Calculate monthly average AOD based on the Julian day wise data.

730
13
Jump to solution
09-30-2018 07:47 PM
BIJOYGAYEN
New Contributor II
import datetime
start = datetime.datetime.now()
print 'start run: %s\n' % (start)
import arcpy, os
from arcpy import sa
from arcpy import env
import glob
arcpy.env.overwriteOutput = True
#outdir=r"G:\\ANFIS_FINAL\\AOD\\3KM_AOD\\"
di = r"G:\\ANFIS_FINAL\\AOD\\3KM_AOD\\201-18"
lstrasters = glob.glob(di + os.sep + "*Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth_Land.tif")
lstrasters.sort()


M01 =[]
M02 =[]
M03 =[]
M04 =[]
M05 =[]
M06 =[]
M07 =[]
M08 =[]
M09 =[]
M10 =[]
M11 =[]
M12 =[]

yer=2017
for ras in lstrasters:
    yrr=str(ras[49:(len(ras)-116)])
    year=int(yrr)
    #print year
    if year == yer:
     jdate=str(ras[53:(len(ras)-113)])
     day=int(jdate)
     #print day
     sa.Raster(ras)
     if day in range     (1,32):
        M01.append(ras)
     elif day in range  (32,60):
        M02.append(ras)
     elif day in range  (60,91):
        M03.append(ras)
     elif day in range (91,121):
        M04.append(ras)
     elif day in range(121,152):
        M05.append(ras)
     elif day in range(152,182):
        M06.append(ras)
     elif day in range(182,213):
        M07.append(ras)
     elif day in range(213,244):
        M08.append(ras)
     elif day in range(244,274):
        M09.append(ras)
     elif day in range(274,305):
        M10.append(ras)
     elif day in range(305,335):
        M11.append(ras)
     elif day in range (335,366):
        M12.append(ras)

     else:
        print "Jdate out of range"
    else:
        print ("No available Data in this year:"+ str(yer))
        #yer+=0001



print M01
print M02
print M03
print M04
print M05
print M06
print M07
print M08
print M09
print M10
print M11
print M12



def Null_sc_mean(input,output):
 SCM00_000=[]
 if len(input)!= 0:
             for ds in input:
               print ds
               dat=str(ds[59:(len(ds)-113)])
               setnull =arcpy.gp.SetNull_sa(ds,ds, "in_memory/dat", "\"Value\" < 0")
               ras=arcpy.Raster(setnull)
               scalefactor=float(0.001)
               scale=scalefactor*ras
               SCM00_000.append(scale)
               if len(SCM00_000)!=0:
                   arcpy.gp.CellStatistics_sa(SCM00_000,output,"MEAN", "DATA")

extra=1
m3 ="G:\\ANFIS_FINAL\\AOD\\3KM_AOD\\01_18\\"
for ls in M01,M02,M03,M04,M05,M06,M07,M08,M09,M10,M11,M12:
  out1= m3+'{0}.img'.format("Month"+ str(extra)+str(yer))
  mon1= Null_sc_mean(ls,out1)
  extra += 1
  if extra==13:
    yer+=0001




print arcpy.GetMessages()
print 'finished run: %s\n\n' % (datetime.datetime.now() - start)





I have Julian day wise AOD data 2017/06 to 2018/05. Every day(...2017152...) have three files(...Land1, Land2, Land3), here we'll take only the land2 data after that I want to calculate the monthly average value from month wise data and save it in raster format with month wise by the loop. Before going to the average function we have some additional calculation is that every reading raster file first removes or deleting specific value ( less than 0 to -9999) and multiplied with 0.001 value, and then going to the Average calculation function. 

When  I am running this code then there is some anomaly. like when I am processing the data specifying subsequent months of a year or years then it saves the output files haphazardly. For example, my desired years are 2017 and 2018. It picks all the file for the specified years, but it saves the output file for 2018 in 2017 file and along with its own output file.

kindly give me a resolution or any other alternative way, and also I want to set this code leap year also.

I have attached some sample data.

0 Kudos
1 Solution

Accepted Solutions
RandyBurton
MVP Regular Contributor

The traceback error suggests that the "from datetime import datetime" did not execute properly.  It may be a Python version, so I would suggest checking documentation for your version.  Also, "from datetime import datetime" also imports only a section of the datetime module; so you wouldn't want to also "import datetime".

This line takes a Julian date in the format YYYYJJJ and converts it to a date, then the month is extracted from the date before being converted to an integer.  The section of code that it is in uses the year and month to insert the data into a dictionary of dictionaries.

month = int(datetime.strptime(jdate, '%Y%j').date().strftime("%m"))‍‍

This section reads the nested dictionary. The outer keys are the years, and the value are the dictionaries containing the months.  When you iterate through the items in these values, you get the month keys and their corresponding values.  In your case, it looks like you will pass the v2 list to your Null_sc_mean function.

for k1,v1 in rasters.iteritems(): # k1 outer key (year), v1 outer values (month dictionaries)
    print "Year: {}".format(k1)
    for k2,v2 in v1.iteritems(): # k2 inner key (month), v2 inner values (file list)
        if len(v2):
            print "\tMonth: {}".format(k2)
            for r in v2:
                print '\t\t{}'.format(r)‍‍‍‍‍‍‍

I haven't tested it yet, but the above section should become something like this:

m3 = "G:\\ANFIS_FINAL\\AOD\\3KM_AOD\\01_18\\"

for k1,v1 in rasters.iteritems():
    year =k1
    for k2,v2 in v1.iteritems():
        if len(v2):
            month =k2
            out1= '{}Month{}_{}.img'.format(m3,month,year)
            # print out1
            mon1 = Null_sc_mean(v2,out1) # not sure if mon1 is what you need‍‍‍‍‍‍‍‍‍‍

I'm not sure what you intend to do with the variable mon1.  As you will be looping through the months and years in file folder, this variable will be overwritten with each iteration.  But since the Null_sc_mean returns None (there is no return value in that section of code) it is not important.  However, you may want the function to return a success/fail code that you can check in this section.  I haven't looked at the workings of the function that is being called.

I'm still studying your code and it looks like you are trying to get the filenames organized into a series of lists.  The dictionary approach simplifies it.  I also noticed that you are trying to go through those lists at line 102, but I think you are missing some brackets.

for ls in M01,M02,M03,M04,M05,M06,M07,M08,M09,M10,M11,M12:

# try (note the brackets):

for ls in [M01,M02,M03,M04,M05,M06,M07,M08,M09,M10,M11,M12]:‍‍‍‍‍

When I am writing new code, I will have lots of print statements that will show the values of variables, etc. at various points in the code.  As testing progresses, I will comment out the print statements and add additional code for testing.

Hope this helps.

View solution in original post

13 Replies
BIJOYGAYEN
New Contributor II

Please, somebody, help me to solve my problem.

0 Kudos
RandyBurton
MVP Regular Contributor

Breaking the problem into smaller steps, we can use a dictionary to group the raster files.  I assume the filenames are consistent in format?

from datetime import datetime

# lstrasters = glob.glob(di + os.sep + "*Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth_Land.tif")
lstrasters = [  # list from your directory search 
    'MOD04_3K.A20161.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif',
    'MOD04_3K.A20162.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif',
    'MOD04_3K.A2017152.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif',
    'MOD04_3K.A2017153.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif',
    'MOD04_3K.A2017254.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif',
    'MOD04_3K.A2017255.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif',
    'MOD04_3K.A2018356.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif',
    'MOD04_3K.A2018357.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif'
    ]  # these 'rasters' are for months 1 in 2016 (will error), months 6 and 9 in 2017, and month 12 in 2018

rasters = { 2017: {1:[],2:[],3:[],4:[],5:[],6:[],7:[],8:[],9:[],10:[],11:[],12:[]},
            2018: {1:[],2:[],3:[],4:[],5:[],6:[],7:[],8:[],9:[],10:[],11:[],12:[]}
            } # starting dictionary

for r in lstrasters:
    jdate = r.split(".")[1][1:] # get date portion, splitting on periods
    # jdate = r[10:17] # if date is always 7 characters and same position
    year = int(jdate[:4])
    month = int(datetime.strptime(jdate, '%Y%j').date().strftime("%m"))
    if year in rasters.keys():
        rasters[year][month].append(r)
    else:
        print "ERROR year {}: {}".format(year, r)

for k1,v1 in rasters.iteritems():
    print "Year: {}".format(k1)
    for k2,v2 in v1.iteritems():
        if len(v2):
            print "\tMonth: {}".format(k2)
            for r in v2:
                print '\t\t{}'.format(r)

''' Output:
ERROR year 2016: MOD04_3K.A20161.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif
ERROR year 2016: MOD04_3K.A20162.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif
Year: 2017
     Month: 6
          MOD04_3K.A2017152.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif
          MOD04_3K.A2017153.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif
     Month: 9
          MOD04_3K.A2017254.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif
          MOD04_3K.A2017255.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif
Year: 2018
     Month: 12
          MOD04_3K.A2018356.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif
          MOD04_3K.A2018357.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif
'''‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Are you anticipating some invalid Julian dates - day 0 or 367, for example?  If so, can you explain a bit more?

Next step would be to process your files.  I take another look at your code later.

BIJOYGAYEN
New Contributor II
Traceback (most recent call last):
  File "<module1>", line 33, in <module>
AttributeError: 'module' object has no attribute 'strptime'

This code getting me this error. Can you explain line 23 and 30 to 35? Actually, I have 2015 to 2018 datasets I want to calculate the mean of each month and save it by their month and year name. Befor we go to the mean calculation have some additional calculation that I mention it in first.

0 Kudos
RandyBurton
MVP Regular Contributor

The traceback error suggests that the "from datetime import datetime" did not execute properly.  It may be a Python version, so I would suggest checking documentation for your version.  Also, "from datetime import datetime" also imports only a section of the datetime module; so you wouldn't want to also "import datetime".

This line takes a Julian date in the format YYYYJJJ and converts it to a date, then the month is extracted from the date before being converted to an integer.  The section of code that it is in uses the year and month to insert the data into a dictionary of dictionaries.

month = int(datetime.strptime(jdate, '%Y%j').date().strftime("%m"))‍‍

This section reads the nested dictionary. The outer keys are the years, and the value are the dictionaries containing the months.  When you iterate through the items in these values, you get the month keys and their corresponding values.  In your case, it looks like you will pass the v2 list to your Null_sc_mean function.

for k1,v1 in rasters.iteritems(): # k1 outer key (year), v1 outer values (month dictionaries)
    print "Year: {}".format(k1)
    for k2,v2 in v1.iteritems(): # k2 inner key (month), v2 inner values (file list)
        if len(v2):
            print "\tMonth: {}".format(k2)
            for r in v2:
                print '\t\t{}'.format(r)‍‍‍‍‍‍‍

I haven't tested it yet, but the above section should become something like this:

m3 = "G:\\ANFIS_FINAL\\AOD\\3KM_AOD\\01_18\\"

for k1,v1 in rasters.iteritems():
    year =k1
    for k2,v2 in v1.iteritems():
        if len(v2):
            month =k2
            out1= '{}Month{}_{}.img'.format(m3,month,year)
            # print out1
            mon1 = Null_sc_mean(v2,out1) # not sure if mon1 is what you need‍‍‍‍‍‍‍‍‍‍

I'm not sure what you intend to do with the variable mon1.  As you will be looping through the months and years in file folder, this variable will be overwritten with each iteration.  But since the Null_sc_mean returns None (there is no return value in that section of code) it is not important.  However, you may want the function to return a success/fail code that you can check in this section.  I haven't looked at the workings of the function that is being called.

I'm still studying your code and it looks like you are trying to get the filenames organized into a series of lists.  The dictionary approach simplifies it.  I also noticed that you are trying to go through those lists at line 102, but I think you are missing some brackets.

for ls in M01,M02,M03,M04,M05,M06,M07,M08,M09,M10,M11,M12:

# try (note the brackets):

for ls in [M01,M02,M03,M04,M05,M06,M07,M08,M09,M10,M11,M12]:‍‍‍‍‍

When I am writing new code, I will have lots of print statements that will show the values of variables, etc. at various points in the code.  As testing progresses, I will comment out the print statements and add additional code for testing.

Hope this helps.

BIJOYGAYEN
New Contributor II
def Null_sc_mean(input,output):
 SCM00_000=[]
 if len(input)!= 0:
             for ds in input:
               print ds
               dat=str(ds[59:(len(ds)-113)])
               setnull =arcpy.gp.SetNull_sa(ds,ds, "in_memory/dat", "\"Value\" < 0")
               ras=arcpy.Raster(setnull)
               scalefactor=float(0.001)
               scale=scalefactor*ras
               SCM00_000.append(scale)
               if len(SCM00_000)!=0:
                   arcpy.gp.CellStatistics_sa(SCM00_000,output,"MEAN", "DATA")

extra=1
m3 ="G:\\ANFIS_FINAL\\AOD\\3KM_AOD\\01_18\\"
for ls in M01,M02,M03,M04,M05,M06,M07,M08,M09,M10,M11,M12:
  out1= m3+'{0}.img'.format("Month"+ str(extra)+str(yer))
  Null_sc_mean(ls,out1)
  extra += 1
  if extra==13:
    yer+=0001

  

for ls in M01,M02,M03,M04,M05,M06,M07,M08,M09,M10,M11,M12:

 I want to avoid this function because this is so complex and errors. I have 2015 to 2018 datasets, some month have no data in this situation when I save this data by the loop after complecting Mean I think it will give me the error message. I want to save those months in which they have data. 

 mon1 = Null_sc_mean(v2,out1)

mon1 nothing, just silly mistake

I will test your code tomorrow. 

0 Kudos
BIJOYGAYEN
New Contributor II

Thank you, sir, code working pretty well.

I have some question If I run this code from 2015 to 2018 then such kind of modification will work?

rasters = { 2015: {1:[],2:[],3:[],4:[],5:[],6:[],7:[],8:[],9:[],10:[],11:[],12:[]},
            2016: {1:[],2:[],3:[],4:[],5:[],6:[],7:[],8:[],9:[],10:[],11:[],12:[]}
            2017: {1:[],2:[],3:[],4:[],5:[],6:[],7:[],8:[],9:[],10:[],11:[],12:[]}
            2018: {1:[],2:[],3:[],4:[],5:[],6:[],7:[],8:[],9:[],10:[],11:[],12:[]}
            }

Second things I want to use the "Null_sc"  function for every raster file and save it by their  "YEAR/MONTH/DATE/AOD" name. Then what kind of modification do I do in the code? or your suggestion.

def Null_sc(input,output):
             for ds in input:
               print ds
               setnull =arcpy.gp.SetNull_sa(ds,ds, "in_memory/dat", "\"Value\" < 0")
               ras=arcpy.Raster(setnull)
               scalefactor=float(0.001)
               scale=scalefactor*ras
               save.scale(output)
0 Kudos
RandyBurton
MVP Regular Contributor

Commas are needed at the end of the lines (for 2015, 2016 and 2017 - not needed for 2018 as this is the end of the dictionary), but this should work if the files are found in the directory search.

rasters = { 2015: {1:[],2:[],3:[],4:[],5:[],6:[],7:[],8:[],9:[],10:[],11:[],12:[]},
            2016: {1:[],2:[],3:[],4:[],5:[],6:[],7:[],8:[],9:[],10:[],11:[],12:[]},
            2017: {1:[],2:[],3:[],4:[],5:[],6:[],7:[],8:[],9:[],10:[],11:[],12:[]},
            2018: {1:[],2:[],3:[],4:[],5:[],6:[],7:[],8:[],9:[],10:[],11:[],12:[]}
            }

You can pass the year and month in the function call, but isn't out1 the name for the output file? See line 14 below, and it is being passed to the function.  Its format is including the month and year.

def Null_sc(input,output, year, month):
             for ds in input:
                 # some code to process data
                 savename = "filename{}_{}".format(year,month)
                 # save data

# ........

for k1,v1 in rasters.iteritems():
    year =k1
    for k2,v2 in v1.iteritems():
        if len(v2):
            month =k2
            out1= '{}Month{}_{}.img'.format(m3,month,year)
            # print out1
            mon1 = Null_sc_mean(v2, out1, year, month) # also pass year, month

 

BIJOYGAYEN
New Contributor II

Its ok but my question is how to extract the date from the Julian day that I want to save with raster name.

0 Kudos
RandyBurton
MVP Regular Contributor

To work with Julian days:

>>> from datetime import datetime
>>> jdate = '2017152' # julian day extracted from filename
>>> datetime.strptime(jdate, '%Y%j').date().strftime("%Y") # 4 digit year
'2017'
>>> datetime.strptime(jdate, '%Y%j').date().strftime("%y") # 2 digit year
'17'
>>> datetime.strptime(jdate, '%Y%j').date().strftime("%m") # month
'06'
>>> datetime.strptime(jdate, '%Y%j').date().strftime("%d") # day
'01'
>>> datetime.strptime(jdate, '%Y%j').date() # as a datetime object
datetime.date(2017, 6, 1)
>>> 

Since you may need to convert dates in several places in your code, you could make it a function.  You can pass a Julian day or the filename.

from datetime import datetime

def parseJulian(jdate):
    dt = datetime.strptime(jdate, '%Y%j').date()
    return(dt.year, dt.month, dt.day)

def extractJulian(filename):
    jdate = filename.split(".")[1][1:] # get date portion, splitting on periods
    dt = datetime.strptime(jdate, '%Y%j').date()
    return(dt.year, dt.month, dt.day)


x = parseJulian('2017152')

print x[0] #year: 2017
print x[1] #month: 6
print x[2] #day: 1


x = extractJulian('MOD04_3K.A2017152.mosaic.061.2017271115004.pssgmcrpgs_000501268215.Corrected_Optical_Depth_Land_2-Corrected_Optical_Depth.Land.tif')

print x[0] #year: 2017
print x[1] #month: 6
print x[2] #day: 1‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

You could store this in your dictionary, with some code modification.  Basically you would insert the filename with the date information as a tuple. Dictionary would look something like:

{ 2017: { 0 : [('longfilename', year, month, day),...],...}...}