Calculate mean in 8 day intervals for thousands of daily raster files using cell statistics

11-19-2017 07:00 AM
LarryBiodun
New Contributor III

Hi everyone, I have a folder with thousands of average daily relative humidity raster files, and I would like to use Cell Statistics to calculate the mean in 8-day intervals. The data have dates in the format YYYYMMDD. I've looked at similar code written in answer to a similar question, but it doesn't do the 8-day averaging. I've included the link to that answered question and a picture of what the data look like. Your help is greatly appreciated.

Linked question: "Calculate daily average of raster files based on their names in python" (xander_bakker)

[Image: sample data]


Accepted Solutions
XanderBakker
Esri Esteemed Contributor

Hi larry32, can you try this code?

Change:

  • input folder with .img files on line 6 (input_folder)
  • output folder or fgdb on line 9 (output_ws) and the corresponding extension on line 10 (out_ext)

def main():
    import arcpy
    import os

    # change this folder to the folder with the renamed img files
    input_folder = r'C:\GeoNet\Average8Days\output_files'

    # define the output workspace and extension
    output_ws = r'C:\GeoNet\Average8Days\gdb\myFileGeoDB.gdb'
    out_ext = ""

    # create a name-sorted list of rasters so the YYYYMMDD dates are in order
    arcpy.env.workspace = input_folder
    my_list = sorted(arcpy.ListRasters())

    # check out a Spatial Analyst license
    arcpy.CheckOutExtension("Spatial")

    # loop through chunks of 8 rasters
    for my_list_chunk in chunks(my_list, 8):
        # define the output name
        first_name = my_list_chunk[0]
        name, ext = os.path.splitext(first_name)
        first_date = name[-8:]
        out_ras_name = "mean" + first_date + out_ext
        print("{} {}".format(out_ras_name, my_list_chunk))

        # now we have:
        # - an output raster filename
        # - the list of rasters for calculating the mean value

        # perform the cellstatistics with MEAN
        cellstat = arcpy.sa.CellStatistics(my_list_chunk, "MEAN", "DATA")

        # define file path of output raster and save cell statistics result
        outname = os.path.join(output_ws, out_ras_name)
        cellstat.save(outname)


def chunks(l, n):
    """ Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i+n]

if __name__ == '__main__':
    main()
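
Note that chunks() groups the rasters by position in the list, not by calendar date, so a missing day shifts every later group, and the last group can hold fewer than 8 rasters. The sketch below is not part of the posted solution. It is a plain-Python dry run (no arcpy needed) that lists each output name with its rasters and flags groups that are short or not on consecutive days. The `rh` prefix, the sample dates, and the helper names `date_of` and `plan_outputs` are assumptions made for illustration.

```python
from datetime import datetime, timedelta
import os


def chunks(items, n):
    """Yield successive n-sized chunks from items (the last one may be shorter)."""
    for i in range(0, len(items), n):
        yield items[i:i + n]


def date_of(raster_name):
    """Parse the trailing YYYYMMDD of a raster name (extension ignored)."""
    stem = os.path.splitext(raster_name)[0]
    return datetime.strptime(stem[-8:], "%Y%m%d").date()


def plan_outputs(raster_names, size=8):
    """Return (output_name, chunk, is_complete) for each group of `size` rasters.

    is_complete is True only when the chunk holds `size` rasters
    that fall on consecutive days.
    """
    plan = []
    for chunk in chunks(sorted(raster_names), size):
        days = [date_of(name) for name in chunk]
        consecutive = all(b - a == timedelta(days=1)
                          for a, b in zip(days, days[1:]))
        out_name = "mean" + days[0].strftime("%Y%m%d")
        plan.append((out_name, chunk, len(chunk) == size and consecutive))
    return plan


# Hypothetical file names: 1 to 11 January 2017 with 5 January missing.
start = datetime(2017, 1, 1).date()
names = ["rh{}.img".format((start + timedelta(days=i)).strftime("%Y%m%d"))
         for i in range(11) if i != 4]

plan = plan_outputs(names)
for out_name, chunk, ok in plan:
    print(out_name, len(chunk), "ok" if ok else "check this chunk")
```

Passing the names returned by arcpy.ListRasters() to plan_outputs before running the averaging would show which groups need attention.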


13 Replies
XanderBakker
Esri Esteemed Contributor

Hi Larry Biodun,

First thing I notice is the "+" sign in the name of the img files. This may generate problems when you use them in any mathematical operation in ArcGIS / Python. So it might be necessary to start with renaming the rasters and shorten the names. This can be done with Python. Would this be an option?

Edit:

I also notice that half of the rasters have a "T" after the "+" sign, so two rasters reference the same date. Should the 8-day calculation include 16 rasters?
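
The duplicate dates mentioned above can be listed before deciding how to group the files. A minimal sketch, assuming the date is the first run of eight digits in each file name; the sample names are hypothetical and only imitate the "+" and "+T" pattern described here, and `rasters_by_date` is an illustrative helper.

```python
import re
from collections import defaultdict


def rasters_by_date(file_names):
    """Group file names by their first 8-digit run (assumed to be YYYYMMDD)."""
    groups = defaultdict(list)
    for file_name in file_names:
        match = re.search(r"\d{8}", file_name)
        if match:
            groups[match.group()].append(file_name)
    return groups


# Hypothetical names; the real files may follow a different pattern.
sample = [
    "rhum20170101+0000.img",
    "rhum20170101+T0000.img",
    "rhum20170102+0000.img",
]

# Keep only the dates that are referenced by more than one raster.
duplicates = {day: names
              for day, names in rasters_by_date(sample).items()
              if len(names) > 1}
print(duplicates)
```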

LarryBiodun
New Contributor III

Hi Xander, thanks for your quick response. That was an error in the sample list I sent; thanks for picking it up. The 8-day calculation should only include 8 rasters. I have included another picture of the actual, correct data. I am happy to shorten the names. In fact, anything after the first 14 characters doesn't matter.

[Image: data2]

DanPatterson_Retired
MVP Emeritus

There is a thread here which Xander Bakker led the charge on: https://community.esri.com/message/584750?commentID=584750#comment-584750

Then there is a parallel incarnation here, https://community.esri.com/thread/173806, should you require more statistical-style calculations in either a block or running style.

XanderBakker
Esri Esteemed Contributor

The first step would be to apply some proper naming for the files. See below a snippet of code that could do this:

def main():
    from shutil import copyfile, move
    import os

    # list of extensions you want to copy or rename
    lst_ext = ['.img']

    input_folder = r'C:\GeoNet\Average8Days\input_files'
    output_folder = r'C:\GeoNet\Average8Days\output_files'

    for path, dirs, files in os.walk(input_folder):
        for filename in files:
            name, ext = os.path.splitext(filename)
            if ext in lst_ext:
                in_file = os.path.join(path, filename)
                print(in_file)
                out_name = name[:12] + ext
                out_file = os.path.join(output_folder, out_name)
                print(out_file)

                # decide to copy (the default here):
                copyfile(in_file, out_file)

                # or move (uncomment this and comment out the copy):
                # move(in_file, out_file)

                # or rename (uncomment this and comment out the copy):
                # os.rename(in_file, out_file)


if __name__ == '__main__':
    main()

On line 6, include the extensions of the files you want to copy, move or rename.

On lines 8 and 9, specify the paths of the input and output folders.

On lines 21 to 28, decide whether to copy, move or rename, and keep only that one call. Copying is safest, but it needs additional disk space. If anything depends on the current file names and locations, I recommend copying the data.

Once you have the files with usable names, you are ready for the next step.
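
Because the script shortens every name to its first 12 characters, two source files that differ only after that point would get the same output name, and the second copy would overwrite the first. A possible dry run to check this before copying anything, using hypothetical file names and helper names (`shortened_names`, `collisions`) that are not part of the script above:

```python
import os
from collections import Counter


def shortened_names(file_names, keep=12, extensions=(".img",)):
    """Map each matching file name to its first `keep` characters plus extension."""
    mapping = {}
    for file_name in file_names:
        name, ext = os.path.splitext(file_name)
        if ext in extensions:
            mapping[file_name] = name[:keep] + ext
    return mapping


def collisions(mapping):
    """Return the shortened names that more than one source file maps to."""
    counts = Counter(mapping.values())
    return sorted(out for out, n in counts.items() if n > 1)


# Hypothetical file names; in practice the list could come from os.listdir().
files = [
    "rhum20170101+0000.img",
    "rhum20170101+T0000.img",
    "rhum20170102+0000.img",
    "notes.txt",
]

mapping = shortened_names(files)
clashes = collisions(mapping)
print(clashes)
```

An empty list means every source file keeps a unique name after shortening.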

LarryBiodun
New Contributor III

Thanks Xander. The renaming is done. I chose the copy part of the code. I'll wait for the averaging code. Here is a snapshot of the copied files.

[Image: Renamed data]

DanPatterson_Retired
MVP Emeritus

You can prep by reading Xander's post here, where the shell is outlined

https://community.esri.com/thread/171685?commentID=584750#comment-584750

LarryBiodun
New Contributor III

Thanks for that Dan

XanderBakker
Esri Esteemed Contributor

Can you check if the renamed files open properly in ArcGIS (just open one)? Only the .img file was copied. ArcGIS will probably recreate the aux and xml files, but it is worth confirming that the data is valid.

LarryBiodun
New Contributor III

Yes, they open properly in ArcGIS. The associated files were created when I opened one in ArcGIS.