Select to view content in your preferred language

Zonal statistics - percentiles

30349
65
Jump to solution
06-30-2014 01:35 PM
toddsams
New Contributor III
Hello,

I am looking for an analogous method to Zonal Statistics that is able to generate user-defined percentiles (such as the 5th and 95th percentile) of the input raster data.

I looked at the GME tools and those stats metrics are hardwired in the same fashion as ArcGIS Zonal Statistics tools are.

Any other ideas that I can explore?
65 Replies
XanderBakker
Esri Esteemed Contributor

I wonder if there is much to resolve... The polygons you have are very small in relation to the raster pixels. Some polygons expand over less than 2 pixels.

During the script it will extract those pixels that are "inside" the polygon. In that case there might only be one pixel and all resulting percentiles will be the same.

There is another thing that influences the result drastically. The GetPixelValueForPercentile function will return the value of the pixel when the cumulative percentage is less or equal to the percentile searched for. When there are that little elements in the dictionary of values, this will have a huge impact. Initially, the lowest value of the dictionary is assigned as result. However, with two elements in the dictionary, also 98% will yield the first value in the dictionary, since the second value corresponds to 100%...

What exactly are you looking for with this data?

ZengYi-Chao
New Contributor

Xander

Thanks for your suggestions and help. I had found the other way to resolve my problem. But i still have a question what the bigger polygons included several pixels (some more than 10 pixels) are the same result for all percentiles as the small polygons.

0 Kudos
XanderBakker
Esri Esteemed Contributor

Hi Zeng Yi-Chao , I had a new look at your comment and indeed it should not return the same values if the polygons intersects rasters cells with different values. I did some tests and it seems that there are some issues with the extraction of the polygon from the raster using ExtractByPolygon. I changed this to ExtractByMask providing the polygon geometry as input and that seems to correct the problem. However for smaller polygon (less than a certain percentage of the size of a single pixel) it will not return any result.

To correct this one could apply a buffer to these smaller polygons so it will at least contain a pixel.

I will attach some code later after I test the concept on your data.

Kind regards, Xander

XanderBakker
Esri Esteemed Contributor

I just ran a test on a single part version of the shapefile you provided. Below the code used and attached the resulting shapefile.

import arcpy

def main():
    # settings
    ras = r"C:\GeoNet\ZonalStatsSmall\test20150910\i2015080723001.tif"
    fc = r"C:\GeoNet\ZonalStatsSmall\test20150910\testshp67sp.shp"
    lst_perc = [2, 5, 10, 25, 50, 75, 90, 95, 98] # list of percentiles to be calculated
    fld_prefix = "Perc_"
    start_at = 0

    arcpy.env.workspace = "IN_MEMORY"
    arcpy.env.overwriteOutput = True

    # add fields if they do not exist in fc
    # be aware, fields that exist will be overwritten!
    print "filling field list and adding fields"
    flds = []
    for perc in lst_perc:
        fld_perc = "{0}{1}".format(fld_prefix, perc)
        flds.append(fld_perc)
        if not FieldExist(fc, fld_perc):
            arcpy.AddField_management(fc, fld_perc, "LONG")
    print "flds={0}".format(flds)

    # Enable Spatial analyst
    arcpy.CheckOutExtension("Spatial")

    # get cell area
    desc = arcpy.Describe(ras)
    cell_area = desc.meanCellHeight * desc.meanCellWidth

    # loop through polygons
    print "loop through polygons"
    flds.append("SHAPE@")
    i = 0
    with arcpy.da.UpdateCursor(fc, flds) as curs:
        for row in curs:
            i += 1
            print "Processing polygon: {0}".format(i)
            if i >= start_at:
                polygon = row[flds.index("SHAPE@")]

                if polygon.area < cell_area:
                    polygon = polygon.buffer((desc.meanCellHeight + desc.meanCellWidth) / 4)

                # Execute ExtractByPolygon (you can't send the polygon object)
                print " - ExtractByMask..."

                # arcpy.sa.ExtractByPolygon(ras, lst_parts, "INSIDE")
                ras_pol = arcpy.sa.ExtractByMask(ras, polygon)
                del polygon
                outname = "ras{0}".format(i)
                ras_pol.save(outname)
                print " - saved raster as {0}".format(outname)

                # create dictionary with value vs count
                print " - fill dict with Value x Count"
                flds_ras = ("VALUE", "COUNT")
                dct = {row[0]:row[1] for row in arcpy.da.SearchCursor(outname, flds_ras)}
                # del ras_pol

                # calculate number of pixels in raster
                print " - determine sum"
                cnt_sum = sum(dct.values())
                print " - sum={0}".format(cnt_sum)

                # loop through dictionary and create new dictionary with val vs percentile
                print " - create percentile dict"
                dct_per = {}
                cnt_i = 0
                for val in sorted(dct.keys()):
                    cnt_i += dct[val]
                    dct_per[val] = cnt_i / cnt_sum
                del dct

                # loop through list of percentiles
                print " - iterate perceniles"
                for perc in lst_perc:
                    # use dct_per to determine percentiles
                    perc_dec = float(perc) / 100
                    print "  - Perc_dec for is {0}".format(perc_dec)
                    pixval =  GetPixelValueForPercentile(dct_per, perc_dec)
                    print "  - Perc for {0}% is {1}".format(perc, pixval)

                    # write pixel value to percentile field
                    print "  - Store value"
                    fld_perc = "{0}{1}".format(fld_prefix, perc)
                    row[flds.index(fld_perc)] = pixval
                del dct_per

                # update row
                print " - update row"
                curs.updateRow(row)

    # return SA license
    arcpy.CheckInExtension("Spatial")

def GetPixelValueForPercentile(dctper, percentile):
    """will return last pixel value
      where percentile LE searched percentile."""
    pix_val = -1
    try:
        pix_val = sorted(dctper.keys())[0] # initially assign lowest pixel value
        for k in sorted(dctper.keys()):
            perc = dctper
            if perc <= percentile:
                pix_val = k
            else:
                break
    except Exception as e:
        pass
    return pix_val

def FieldExist(featureclass, fieldname):
    """Check if field exists"""
    fieldList = arcpy.ListFields(featureclass, fieldname)
    return len(fieldList) == 1

if __name__ == '__main__':
    main()
ZengYi-Chao
New Contributor

Xander

i am grateful for your help and assistance. The code seems feasible and work.  It saves me much time on the analysis of data

Best regards

0 Kudos
XanderBakker
Esri Esteemed Contributor

I'm glad it's useful. If it is Helpful to you, you may want to consider using the "Helpful" link so that other users may find this answer easier.

Kind regards, Xander

BennoPerchermeier
New Contributor II

Thanks Xander, this script is very helpful!! 

I did 3 edits:

- Like Axel Thomas, I had to change the workspace from "IN_MEMORY" to a Geodatabase, because I was working with large data. Worked like a charm. (In every loop it overwrites the previously created little rasters)

- I inculded a snap raster environment, because I actually found good use in the small rasters created.

- Lastly I changed the print statements for arcpy.AddMessage statements just because thats my personal style. Also, this might avoide issues with Python 3.

# ---------------------------------------------------------------------------
# Title: CalcPercentiles
# Created on: 2015-09-26 12:42 by Xander
# Last Edit: 2019-01-25 11:45 by Benno
# Description: Takes a raster and calculates percentiles for segments in a given mask.
# - As Input this script needs a (segmented) feature class and a raster.
# - The percentile values will be listed in the table of the mask.
# - (In the Scratch workspace the script saves the little rasters that are created by ExtractByMask. They can be useful
# later on.)
# ---------------------------------------------------------------------------
import arcpy


def main():
    # settings
    ras = r"D:\Extractor\ras.gdb\DEM"
    fc = r"D:\Extractor\ras.gdb\pols"
    # ras = r"C:\GeoNet\ZonalStatsSmall\test20150910\i2015080723001.tif"
    # fc = r"C:\GeoNet\ZonalStatsSmall\test20150910\testshp67sp.shp"
    lst_perc = [2, 5, 10, 25, 50, 75, 80, 90, 95, 98]  # list of percentiles to be calculated
    fld_prefix = "Perc_"
    start_at = 0

    arcpy.env.workspace = r"D:\Extractor\ScratchWS.gdb"
    arcpy.env.snapRaster = r"D:\Extractor\ras.gdb\DEM"
    arcpy.env.overwriteOutput = True

    # add fields if they do not exist in fc
    # be aware, fields that exist will be overwritten!
    arcpy.AddMessage("filling field list and adding fields")
    flds = []
    for perc in lst_perc:
        fld_perc = "{0}{1}".format(fld_prefix, perc)
        flds.append(fld_perc)
        if not FieldExist(fc, fld_perc):
            arcpy.AddField_management(fc, fld_perc, "LONG")
    arcpy.AddMessage("flds={0}".format(flds))

    # Enable Spatial analyst
    arcpy.CheckOutExtension("Spatial")

    # get cell area
    desc = arcpy.Describe(ras)
    cell_area = desc.meanCellHeight * desc.meanCellWidth

    # loop through polygons
    arcpy.AddMessage("loop through polygons")
    flds.append("SHAPE@")
    i = 0
    with arcpy.da.UpdateCursor(fc, flds) as curs:
        for row in curs:
            i += 1
            arcpy.AddMessage("Processing polygon: {0}".format(i))
            if i >= start_at:
                polygon = row[flds.index("SHAPE@")]

                if polygon.area < cell_area:
                    polygon = polygon.buffer((desc.meanCellHeight + desc.meanCellWidth) / 4)

                    # Execute ExtractByPolygon (you can't send the polygon object)
                arcpy.AddMessage(" - ExtractByMask...")

                # arcpy.sa.ExtractByPolygon(ras, lst_parts, "INSIDE")
                ras_pol = arcpy.sa.ExtractByMask(ras, polygon)
                del polygon
                outname = "ras{0}".format(i)
                ras_pol.save(outname)
                arcpy.AddMessage(" - saved raster as {0}".format(outname))

                # create dictionary with value vs count
                arcpy.AddMessage(" - fill dict with Value x Count")
                flds_ras = ("VALUE", "COUNT")
                dct = {row[0]: row[1] for row in arcpy.da.SearchCursor(outname, flds_ras)}
                # del ras_pol

                # calculate number of pixels in raster
                arcpy.AddMessage(" - determine sum")
                cnt_sum = sum(dct.values())
                arcpy.AddMessage(" - sum={0}".format(cnt_sum))

                # loop through dictionary and create new dictionary with val vs percentile
                arcpy.AddMessage(" - create percentile dict")
                dct_per = {}
                cnt_i = 0
                for val in sorted(dct.keys()):
                    cnt_i += dct[val]
                    dct_per[val] = cnt_i / cnt_sum
                del dct

                # loop through list of percentiles
                arcpy.AddMessage(" - iterate perceniles")
                for perc in lst_perc:
                    # use dct_per to determine percentiles
                    perc_dec = float(perc) / 100
                    arcpy.AddMessage("  - Perc_dec for is {0}".format(perc_dec))
                    pixval = GetPixelValueForPercentile(dct_per, perc_dec)
                    arcpy.AddMessage("  - Perc for {0}% is {1}".format(perc, pixval))

                    # write pixel value to percentile field
                    arcpy.AddMessage("  - Store value")
                    fld_perc = "{0}{1}".format(fld_prefix, perc)
                    row[flds.index(fld_perc)] = pixval
                del dct_per

                # update row
                arcpy.AddMessage(" - update row")
                curs.updateRow(row)

                # return SA license
    #arcpy.CheckInExtension("Spatial")


def GetPixelValueForPercentile(dctper, percentile):
    """will return last pixel value
      where percentile LE searched percentile."""
    pix_val = -1
    try:
        pix_val = sorted(dctper.keys())[0]  # initially assign lowest pixel value
        for k in sorted(dctper.keys()):
            perc = dctper[k]
            if perc <= percentile:
                pix_val = k
            else:
                break
    except Exception as e:
        pass
    return pix_val


def FieldExist(featureclass, fieldname):
    """Check if field exists"""
    fieldList = arcpy.ListFields(featureclass, fieldname)
    return len(fieldList) == 1


if __name__ == '__main__':
    main()

‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍
IalinaVinci
New Contributor

Hi, unfortunately I have the spatial analyst linked to an ArcGis 9.3 licence, so I I tried the script adapted to python 2.6 and ArcGIS9.3, so far I've been able to do searching in the net (I'm a beginner). I run the script and I've got his message:

Traceback (most recent call last):

  File "C:\Python26\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 326, in RunScript

    exec codeObject in __main__.__dict__

  File "C:\Documents and Settings\lfantinato\Documenti\zone_per2.py", line 34, in <module>

    gp.env.Workspace = r"G:\ialina2\dtm\zon_st.gdb"

  File "C:\Python26\lib\site-packages\win32com\client\dynamic.py", line 522, in __getattr__

    raise AttributeError("%s.%s" % (self._username_, attr))

AttributeError: esriGeoprocessing.GpDispatch.1.env

what are the problems? is it possible to fix them?

If someone is able to help me, I would be very grateful,

thank you

Ialina

this is the script adjusted by me:

def GetPixelValueForPercentile(dctper, percentile):
##"""will return last pixel value where percentile LE searched percentile."""
    for k in sorted(dctper.keys()):
        perc = dctper
        if perc <= percentile:
            pix_val = k
        else:
            break
    return pix_val

def FieldExist(featureclass, fieldname):
##    """Check if field exists"""
    import win32com.client
    gp = win32com.client.Dispatch("esriGeoprocessing.GpDispatch.1")
    fieldList = gp.ListFields(featureclass, fieldname)
    return len(fieldList) == 1

import win32com.client
gp = win32com.client.Dispatch("esriGeoprocessing.GpDispatch.1")
##    settings
ras = r"G:\ialina2\dtm\dtmx10clipgrd"
fc = r"G:\ialina2\dtm\polig_250clip"
## ras = r"C:\Path\To\Folder\Or\Geodatabase\With\Raster"
## fc = r"C:\Path\To\Folder\Or\Geodatabase\With\PolygonFeatureclass"
lst_perc = [2, 5, 10, 90, 95, 98] ## list of percentiles to be calculated
fld_prefix = "Perc_"

gp.env.Workspace = r"G:\ialina2\dtm\zon_st.gdb"
##arcpy.env.workspace = "IN_MEMORY"
gp.env.overwriteOutput = True

## add fields if they do not exist in fc
## be aware, field that exist will be overwritten!
print "filling field list and adding fields"
flds = []
for perc in lst_perc:
    fld_perc = "{0}{1}".format(fld_prefix, perc)
    flds.append(fld_perc)
    if not FieldExist(fc, fld_perc):
        gp.AddField_management(fc, fld_perc, "LONG")
print "flds={0}".format(flds)

## Enable Spatial analyst
gp.CheckOutExtension("Spatial")

## loop through polygons
print "loop through polygons"
flds.append("SHAPE@")
i = 0
with gp.da.UpdateCursor(fc, flds) as curs:
    for row in curs:
        i += 1
        print "Processing polygon: {0}".format(i)
        polygon = row[flds.index("SHAPE@")]
        lst_parts = []
        if polygon.partCount == 1:
            for part in polygon:
                for pnt in part:
                    x, y = pnt.X, pnt.Y
                    lst_parts.append(gp.Point(x, y))
        else:
            for part in polygon:
                lst_crds = []
                for pnt in part:
                    x, y = pnt.X, pnt.Y
                    lst_crds.append(gp.Point(x, y))
                lst_parts.append(lst_crds)
            print " - lst_parts={0}".format(lst_parts)

        ## Execute ExtractByPolygon (you can't send the polygon object)
        print " - ExtractByPolygon..."
        ras_pol = gp.sa.ExtractByPolygon(ras, lst_parts, "INSIDE")
        outname = "ras{0}".format(i)
        ras_pol.save(outname)
        print " - saved raster as {0}".format(outname)

        ## create dictionary with value vs count
        print " - fill dict with Value x Count"
        flds_ras = ("VALUE", "COUNT")
        dct = dict((row[0],row[1]) for row in gp.da.SearchCursor(outname, flds_ras))

        ## calculate number of pixels in raster
        print " - determine sum"
        cnt_sum = sum(dct.values())
        print " - sum={0}".format(cnt_sum)

        ## loop through dictionary and create new dictionary with val vs percentile
        print " - create percentile dict"
        dct_per = {}
        cnt_i = 0
        for val in sorted(dct.keys()):
            cnt_i += dct[val]
            dct_per[val] = cnt_i / cnt_sum

        ## loop through list of percentiles
        print " - iterate percentiles"
        for perc in lst_perc:
            ## use dct_per to determine percentiles
            perc_dec = float(perc) / 100
            print "   - Perc_dec for is {0}".format(perc_dec)
            pixval =  GetPixelValueForPercentile(dct_per, perc_dec)
            print "   - Perc for {0}% is {1}".format(perc, pixval)

            ## write pixel value to percentile field
            print "   - Store value"
            fld_perc = "{0}{1}".format(fld_prefix, perc)
            row[flds.index(fld_perc)] = pixval

        ## update row
        print " - update row"
        curs.updateRow(row)

## return SA license
gp.CheckInExtension("Spatial")

0 Kudos
XanderBakker
Esri Esteemed Contributor

That might be a challenge... The error points to the gdb (file geodatabase). Can you open this gdb (r"G:\ialina2\dtm\zon_st.gdb") in 9.3? If it has been created with a more recent version you will not be able to use it in 9.3.

I haven't looked at the code, but the original code is based on arcpy which was introduced at ArcGIS 10. You may encounter functionality that is not available in 9.3

XanderBakker
Esri Esteemed Contributor

Hi Ialina Vinci ,

There is no gp.da.UpdateCursor. The da (data access module was introduced at 10.1 SP1). You will have to use the "old" searchcursor which uses a different syntax for accessing field values.

You cannot use "SHAPE@" as field name. You will have to describe the featureclass ad access the ShapeFieldname property.

Kind regards, Xander