Select to view content in your preferred language

Zonal statistics - percentiles

31396
65
Jump to solution
06-30-2014 01:35 PM
toddsams
Deactivated User
Hello,

I am looking for an analogous method to Zonal Statistics that is able to generate user-defined percentiles (such as the 5th and 95th percentile) of the input raster data.

I looked at the GME tools and those stats metrics are hardwired in the same fashion as ArcGIS Zonal Statistics tools are.

Any other ideas that I can explore?
65 Replies
XanderBakker
Esri Esteemed Contributor

So this is what I came up with (see code below). It creates a table with the following structure:

ResultingTable.png

The POL_OID field will be added to the polygons also to allow for a relate. It holds the raster name and the fields for the percentiles.

Change this:

  • line 6, point to the raster workspace (it will use all the rasters in that workspace). Remember use only integer rasters!
  • line 7 points to the input polygon feature class
  • line 7 points to the output table that will be created
  • line 10 to 13 contains other settings for percentiles and output fields

import arcpy
import os

def main():
    # settings
    ras_ws = r"C:\Forum\ZonalStatsPercLoop\gdb\Rasters.gdb" # input raster workspace
    fc = r"C:\Forum\ZonalStatsPercLoop\gdb\PolygonAndOutput.gdb\polygons" # input polygons
    tbl = r"C:\Forum\ZonalStatsPercLoop\gdb\PolygonAndOutput.gdb\ZonalStats5" # output related table

    lst_perc = [2, 5, 10, 90, 95, 98] # list of percentiles to be calculated
    fld_prefix = "Perc_"
    fld_poloid = "POL_OID"
    fld_rasname = "RasterName"

    # create table
    tbl_ws, tbl_name = os.path.split(tbl)
    arcpy.CreateTable_management(tbl_ws, tbl_name)

    # add fields and fill fields list
    flds_tbl = [fld_poloid, fld_rasname]
    arcpy.AddField_management(tbl, fld_poloid, "LONG")
    arcpy.AddField_management(tbl, fld_rasname, "TEXT", 50)
    for perc in lst_perc:
        fld_perc = "{0}{1}".format(fld_prefix, perc)
        arcpy.AddField_management(tbl, fld_perc, "LONG")
        flds_tbl.append(fld_perc)

    # Enable Spatial analyst
    arcpy.CheckOutExtension("Spatial")

    # get list of rasters
    lst_ras = getListOfRasterFromWS(ras_ws)

    # environments
    arcpy.env.workspace = "IN_MEMORY"
    arcpy.env.overwriteOutput = True

    # create dictionary with polygons OID vs lst_parts
    dct_pols = createDictPolygons(fc, fld_poloid)

    # start insert cursor for table
    with arcpy.da.InsertCursor(tbl, flds_tbl) as curs_tbl:

        # start loop through rasters
        i = 0
        for ras_name in lst_ras:
            ras = os.path.join(ras_ws, ras_name)
            i += 0
            print "Processing Raster '{0}'".format(ras_name)

            # loop through polygons
            for oid, lst_parts in dct_pols.items():
                lst_row = [oid, ras_name]
                print " - Processing polygon: {0}".format(oid)

                # Execute ExtractByPolygon (you can't send the polygon object)
                print "   - ExtractByPolygon..."
                ras_pol = arcpy.sa.ExtractByPolygon(ras, lst_parts, "INSIDE")
                outname = "ras{0}pol{1}".format(i, oid)
                ras_pol.save(outname)
                print "   - saved raster as {0}".format(outname)

                # create dictionary with value vs count
                print "   - fill dict with Value x Count"
                flds_ras = ("VALUE", "COUNT")
                dct = {row[0]:row[1] for row in arcpy.da.SearchCursor(outname, flds_ras)}

                # calculate number of pixels in raster
                print "   - determine sum"
                cnt_sum = sum(dct.values())
                print "   - sum={0}".format(cnt_sum)

                # loop through dictionary and create new dictionary with val vs percentile
                print "   - create percentile dict"
                dct_per = {}
                cnt_i = 0
                for val in sorted(dct.keys()):
                    cnt_i += dct[val]
                    dct_per[val] = cnt_i / cnt_sum

                # loop through list of percentiles
                print "   - iterate percentiles"
                for perc in lst_perc:
                    # use dct_per to determine percentiles
                    perc_dec = float(perc) / 100
                    print "     - Perc_dec for is {0}".format(perc_dec)
                    pixval =  GetPixelValueForPercentile(dct_per, perc_dec)
                    print "     - Perc for {0}% is {1}".format(perc, pixval)

                    # write pixel value to percentile field
                    print "     - Store value in list"
                    # fld_perc = "{0}{1}".format(fld_prefix, perc)
                    # row[flds.index(fld_perc)] = pixval
                    lst_row.append(pixval)

                # update row
                print "   - insert row"
                row = tuple(lst_row)
                curs_tbl.insertRow(row)

    # return SA license
    arcpy.CheckInExtension("Spatial")
    print "Ready..."


def createDictPolygons(fc, fld_poloid):
    flds = ("OID@", "SHAPE@", fld_poloid)

    # add pol oid field for link
    if not FieldExist(fc, fld_poloid):
        arcpy.AddField_management(fc, fld_poloid, "LONG")

    dct_pols = {}
    with arcpy.da.UpdateCursor(fc, flds) as curs:
        for row in curs:
            oid = row[0]
            polygon = row[1]
            row[2] = oid

            lst_parts = []
            if polygon.partCount == 1:
                for part in polygon:
                    for pnt in part:
                        x, y = pnt.X, pnt.Y
                        lst_parts.append(arcpy.Point(x, y))
            else:
                for part in polygon:
                    lst_crds = []
                    for pnt in part:
                        x, y = pnt.X, pnt.Y
                        lst_crds.append(arcpy.Point(x, y))
                    lst_parts.append(lst_crds)

            # add to dict and store pol oid
            dct_pols[oid] = lst_parts
            curs.updateRow(row)
    return dct_pols


def getListOfRasterFromWS(ras_ws):
    arcpy.env.workspace = ras_ws
    return arcpy.ListRasters()


def GetPixelValueForPercentile(dctper, percentile):
    """will return last pixel value
       where percentile LE searched percentile."""
    try:
        srt_keys = sorted(dctper.keys())
        pix_val = srt_keys[0]
        for k in srt_keys:
            perc = dctper
            if perc <= percentile:
                pix_val = k
            else:
                break
        return pix_val
    except Exception as e:
        print "    - GetPixelValueForPercentile error: {0}".format(e)
        print "dctper:\n{0}".format(dctper)
        return -9999

def FieldExist(featureclass, fieldname):
    """Check if field exists"""
    import arcpy
    fieldList = arcpy.ListFields(featureclass, fieldname)
    return len(fieldList) == 1

if __name__ == '__main__':
    main()

Have fun!

Kind regards, Xander

toddsams
Deactivated User

Works perfectly! Thanks once again Xander.

0 Kudos
XanderBakker
Esri Esteemed Contributor

Glad it works.

If you find it helpful, you may want to mark it as helpful.

Kind regards, Xander

AxelThomas
Regular Contributor

Hi Xander,

I used your code to determine percentiles from a raster but ran into trouble. After encountering a multi-part polygon the code listed the polygon vertices like that

[<Point (3502785.40681, 5526822.79862, #, #)>, etc.]

but failed when extracting polygons:

- ExtractByPolygon...

Traceback (most recent call last):

  File "W:\hydrogeology\scripts\percentiles.py", line 87, in <module>

    ras_pol = arcpy.sa.ExtractByPolygon(ras, lst_parts, "INSIDE")

  File "E:\Programme\ARCGIS\Desktop\Desktop10.1\arcpy\arcpy\sa\Functions.py", line 1231, in ExtractByPolygon

    extraction_area)

  File "E:\Programme\ARCGIS\Desktop\Desktop10.1\arcpy\arcpy\sa\Utils.py", line 47, in swapper

    result = wrapper(*args, **kwargs)

  File "E:\Programme\ARCGIS\Desktop\Desktop10.1\arcpy\arcpy\sa\Functions.py", line 1226, in wrapper

    extraction_area)

  File "E:\Programme\ARCGIS\Desktop\Desktop10.1\arcpy\arcpy\geoprocessing\_base.py", line 498, in <lambda>

    return lambda *args: val(*gp_fixargs(args, True))

RuntimeError: Object: Error in executing tool

Do you have any ideas what might cause the trouble? Your help is greatly appreciated!

Best regards,

Axel

0 Kudos
XanderBakker
Esri Esteemed Contributor

Hi Axel Thomas ,

Could you attach a/the polygon that caused the error? That would help me to trace what is going wrong. I could create a multi-polygon over the test area I have, but that might not be the same.

Kind regards, Xander

AxelThomas
Regular Contributor

Xander,

thanks for your fast response. I have attached the complete shape file. You can find the polygon that caused the trouble by looking at the percentile fields at the end of the table and scroll down until the values change to 0.

Hope that helps to fix the error.

Thanks in advance and best regards

Axel

0 Kudos
XanderBakker
Esri Esteemed Contributor

Hi Axel Thomas ,

Thanks for posting the data, I will look into it as soon as I get some things "of my plate"...

Kind regards, Xander

XanderBakker
Esri Esteemed Contributor

OK, so I loaded your features, used some Spatial Adjustment to move them over a dummy DEM that I have here, and ran the script which reproduced the error that you encountered. I made a small change to the script o assure that no matter is it is multipart it will simply add the points to the list as if it where a single part polygon.... and this works. Checking the result manually generated something very similar. So I think you could do this:

So change this line (around line 55):

        if polygon.partCount == 1:

into this:

        if polygon.partCount != 0:

... and please try again.

AxelThomas
Regular Contributor

Hi Xander,

replaced the line as you said and the code ran happily away - until polygon 60:

Processing polygon: 60

4300.2_5201

Traceback (most recent call last):

  File "W:\hydrogeologie\Verweilzeiten\modellierung_2010\Juelich\skripte\Perzentile_aus_raster.py", line 75, in <module>

    x, y = pnt.X, pnt.Y

AttributeError: 'NoneType' object has no attribute 'X'

I guess it means that the raster is smaller than the extent of the polygon layer and the sampled pixels contain NULL values? I checked the polygon in question and as far as I can see there should be no problem - the raster contains '0' values which I would expect to be processed by the code just as any other pixel value. Up until polygon 60 the code has processed quite a number of border polygons without problem. Polygon 60 again is a two-part polygon but with your last change the code processed other polygons of this kind without problem.

Do you have an idea?

.

0 Kudos
XanderBakker
Esri Esteemed Contributor

I will check what is going on with that particular case. What I think might be the case could be due to an island that has a None type point to indicate the island. Will have to see if that is really the case.

Thanks for adding the raster data, that will make things less complex (I had extract just a few polygons until the multipart) so with the raster I can see what else might be going wrong.