I'm helping a researcher who is evaluating the pixel statistics of Night Light data by admin zone. She wants to calculate (along with normal zonal stats) the 5th and 95th percentile values for pixels in the raster by zone.
Given a raster with an attribute table, this would seem easy enough to accomplish, but I can't figure out how to do it without converting the raster to points to allow a spatial join between pixel values and admin ids before exporting to a table. That's not practical for a global raster dataset.
Any help is appreciated.
David
As you are aware, if you can assume a normal distribution of the values within the zone, then all you need is the mean and standard deviation to calculate the percentile values at other z-value points (for those who are not aware, see Percentile - Wikipedia, the free encyclopedia). However, other methods are going to require that you obtain the individual values within each zone. As you suggest, this would require a conversion of the raster to points, since each point represents a cell. This is fine if the cell area is in planar units (sq m, etc.) but problematic if it is in square degrees, since cell area in 'real' units varies poleward.
You might want to take a different tack and examine the Zonal Statistics as Table output to see if you can make that assumption, but if the median differs significantly from the mean and/or you are working in geographic coordinates, this approach will be faulty.
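To make the normal-distribution shortcut concrete, here is a minimal sketch in plain Python. The function name and the example mean and standard deviation are made up for illustration, and the result is only meaningful if the zone's values really are close to normal.

```python
from statistics import NormalDist

def normal_percentile(mean, std, pct):
    """Value at percentile `pct` (0-100), ASSUMING the zone is normally distributed."""
    z = NormalDist().inv_cdf(pct / 100.0)  # z-score for the requested percentile
    return mean + z * std

# Hypothetical zonal statistics for one admin zone
zone_mean, zone_std = 20.0, 5.0
p5 = normal_percentile(zone_mean, zone_std, 5)
p95 = normal_percentile(zone_mean, zone_std, 95)
print(round(p5, 3), round(p95, 3))
```

The 5th and 95th percentiles fall about 1.645 standard deviations below and above the mean, so the mean and standard deviation from a zonal statistics table would be enough under that assumption.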
Step one of almost any research project here is to get out of GCS, unless the resampling that the projection causes would change the raster values we need to evaluate. With data like the LandScan global population rasters you don't want to interpolate the population values before measuring them. The same might be true of the night lights data, but I think the variation is probably minimal at the admin 2 level.
My stats background is limited, so I wasn't aware that you could derive the percentile from the mean and standard deviation. I doubt the data are going to be normally distributed, though; it's likely to be a skewed distribution for many parts of the world, with far more dark pixels than very bright ones. I'll have the researcher sample a few zones to be sure.
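For sampling a few zones, a rough check along these lines might help. It is only a sketch: the function name is hypothetical, the sample values are invented (mostly dark pixels with a few bright ones), and it assumes the pixel values for one zone have already been pulled into a list or NumPy array.

```python
import numpy as np

def normal_fit_check(values):
    """Compare empirical percentiles of one zone with the normal approximation."""
    values = np.asarray(values, dtype=float)
    mean, std = values.mean(), values.std()
    return {
        "mean": mean,
        "median": float(np.median(values)),
        "std": std,
        "p5_empirical": float(np.percentile(values, 5)),
        "p95_empirical": float(np.percentile(values, 95)),
        "p5_normal": mean - 1.645 * std,   # normal-approximation 5th percentile
        "p95_normal": mean + 1.645 * std,  # normal-approximation 95th percentile
    }

# Invented, strongly skewed sample: 90 dark pixels and 10 bright ones
sample = [0] * 90 + [60] * 10
check = normal_fit_check(sample)
for name, value in check.items():
    print(name, round(float(value), 2))
```

With a sample this skewed, the median sits well below the mean and the normal approximation puts the 5th percentile below zero, which is a sign the shortcut should not be used for that zone.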
Apart from that I take it there is no simple method for tackling this question? Even breaking the raster up by admin 2 zones before vectorizing seems daunting.
I have an example script posted here: Re: Zonal statistics - percentiles
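For readers who want the general idea without following the link, here is a minimal NumPy sketch. It is not the linked script. It assumes the zone ids and the pixel values are already available as two aligned arrays of the same shape (in ArcGIS, something like arcpy.RasterToNumPyArray could supply them, probably in blocks for a global raster), and that a zone id of 0 means "no zone". The function name and the tiny example arrays are made up.

```python
import numpy as np

def zonal_percentiles(zones, values, percentiles=(5, 95), zone_nodata=0):
    """Return {zone_id: (percentile values...)} from two aligned arrays."""
    zones = np.asarray(zones).ravel()
    values = np.asarray(values, dtype=float).ravel()

    # Drop cells outside any zone and cells with missing values
    keep = (zones != zone_nodata) & ~np.isnan(values)
    zones, values = zones[keep], values[keep]

    # Sort by zone id so each zone's cells form one contiguous run
    order = np.argsort(zones, kind="stable")
    zones, values = zones[order], values[order]
    ids, starts = np.unique(zones, return_index=True)

    result = {}
    for zone_id, chunk in zip(ids, np.split(values, starts[1:])):
        result[int(zone_id)] = tuple(float(p) for p in np.percentile(chunk, percentiles))
    return result

# Tiny invented example: 3x3 zone raster and matching value raster
zone_array = [[1, 1, 2], [1, 2, 2], [0, 2, 2]]
value_array = [[0, 10, 5], [20, 5, 5], [99, 5, 45]]
stats = zonal_percentiles(zone_array, value_array)
print(stats)
```

This avoids the raster-to-points conversion entirely, but it treats every cell equally, so in geographic coordinates the cell-area caveat raised earlier in the thread still applies.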