Select to view content in your preferred language

Creating different normal rasters within each entry of a polygon shapefile

2718
10
05-31-2010 06:37 PM
AdamKilpatrick
Emerging Contributor
Howdy

I have a polygon shapefile (of model paddocks...I'm making several models of arid rangelands to test some methodologies), made up of 400 square paddocks in a 20 by 20 grid arrangement. The idea with this model is that each paddock in the polygon shp has a different value, ranging from 1 to 400.

I need to create a raster that covers the entire shp. Within the extent of each paddock, the cells or pixels of the raster need to have a normal distribution with a mean equal to the value of the paddock (standard deviation should be about 10% of mean).

I'm having a bit of trouble finding how to create this...I'm resorting to python scripting of loops to help.

I can create a normal raster of each paddock within the shp, using the "create constant raster" tool, after I've selected that paddock in the loop. From what I can tell, this makes a normal raster with a mean of 0 (zero), and a standard deviation of, from what I can tell visually, 1. There aren't any options (unfortunately) with this tool to set your own mean and s.d. (Seriously...why not!).

Anyone have any ideas for working around this issue? I'm thinking of somehow normalizing (hehe) these "normal" rasters to values of a range of 0 to 1 (with subsequent mean of 0.5), and then doing a multiplication by twice the paddock value. I need to get the minimum and maximum values of the normal raster first though (then do a calculation somehow of (x-min)/(max-min) to do the 0 to 1 normalization. Then do the 2*paddock value multiplication. I'm not sure how to script either of these. At the end of the loop I'll merge the rasters that I've created for each paddock into one raster of the entire shapefile.

Any help would be greatly appreciated...this is my first foray into creating a python script...and I'm a fair bit out of my depth.
0 Kudos
10 Replies
AdamKilpatrick
Emerging Contributor
Thanks Bill, that makes a lot of sense and cuts to the chase.

I was making a mountain out of a molehill!

Cheers,

Adam 🙂
0 Kudos
AdamKilpatrick
Emerging Contributor
It is also worth noting that past versions of Spatial Analyst have used bad random number generation algorithms (which are undocumented). If you intend to use this work for published research, you should first run tests of the quality of these random values.

Bill, is this still an issue with the Spatial Analyst extension of ArcGIS 9.3 (with all the service packs installed)?
0 Kudos
AdamKilpatrick
Emerging Contributor
...I invite you and any other readers to conduct these tests with your versions of ArcGIS.


Bill, I've just created a random raster of 1024x1024 cells, the converted to integer * 64 as you suggested using raster calculator.

Exported the values and counts to excel and had a look. Looks pretty spot on for randomness, without even running a chi-square test (all pretty close to 16384 counts, departing at most about 12 to 15 counts + or -). I did Excel's chitest, and it came up with a value of 1. I'm not sure how to interpret that. I think I screwed it up (my stats skills are close to non-existant).

But the data certainly looks ok  by eye. I used the "create random raster" tool, in the spatial analyst toolbox of ArcGIS 9.3 sp1 (an Info workstation license I think8).

I'm guessing if that test works, then the normal distribution is more likely to be actually normal?

Cheers,

Adam
0 Kudos
AdamKilpatrick
Emerging Contributor
Oh Bugger. I see what you mean.

I think that value of 1 is the chi-squared value.

Any suggestions on how to make decent random and normal grids then? I'll have a look myself at what Erdas Imagine in and ENVI have, and just use the same extents and then import to ArcMap.

Is there a reason why ESRI hasn't corrected this?
0 Kudos
AdamKilpatrick
Emerging Contributor
Hi Bill

I just created a random (uniform) raster using ENVI 4.7, in its "Generate Test Data" tool.
There are also options for creating random (normal) rasters, constant rasters, guassian rasters...

This is looking more promising. I've attached a spreadsheet of the counts of this raster for you, as well as counts of the ArcGIS SA generated random grid. Both are 1024x1024 cells. The ENVI grid I chose the integer option, and chose a range of 0 to 63 which made the process a lot easier that having to recalculate the raster as I did in ArcMap.

One thing that irks me about the ENVI random(uniform) raster, is that the values 0 and 63 have counts ~1/2 of 16384. To do this, it would have to, on average, over allocate to 1,2,....,62. I'm not sure what's going on here...hoping that ENVI doesn't have a bug in its random generation too. I checked the total count and it is 1024^2.

Cheers,

Adam
0 Kudos
DanPatterson_Retired
MVP Emeritus
Bill
we would expect the chi-square statistic to be near 63 [the correct value is about 62.3].

I presume that this would be on average?
Using Python's random module, a test of 100 runs it appears to be, but not for any individual case:
Average chi_sq 62.3514575195 N runs: 100
Minimum chi_sq 39.1962890625
Maximum chi_sq 94.0982666016
using the attached code.  I can confirm your observations about the SA generated grids in 9.3.1

In any event, if Python's random module is suitable, then you can use the following to create ascii files which can be
converted (Conversion tools, To Raster, ASCII to Raster) to grids.

'''
write_ascii.py

specify the filename, classes etc below

'''
import random

output_file = "c:/temp/demo_ascii.asc"
classes = 64
ncols = 10
nrows = 10
xllcorner =    0
yllcorner =    0
cellsize  =    1
NODATA_value =  -9999

header = ["ncols          " + str(ncols) + "\n",
         "nrows          " + str(nrows) + "\n",
         "xllcorner      " + str(xllcorner) + "\n",
         "yllcorner      " + str(yllcorner) + "\n",
         "cellsize       " + str(cellsize) + "\n",
         "NODATA_value   " + str(NODATA_value)]

out_file = open(output_file,'w')
for i in header:
  out_file.write(i)
  
for i in range(nrows):
  a_line = "\n"
  for j in range(ncols):
    a_val = int(random.random() * classes)
    a_line += str(a_val) + " "
  out_file.write(a_line)
out_file.close()


I forgot to mention, that you can also use:
  randint( a, b) random integer between a and b
  uniform( a, b) random float between a and b
  betavariate( alpha, beta)
  gammavariate( alpha, beta)
  gauss( mu, sigma)
  lognormvariate( mu, sigma)

  amongst others, in place of random.random, check the Python random module documentation
0 Kudos
DanPatterson_Retired
MVP Emeritus
It appears that you can specify the type of random number generator to use since version 9.2
http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=Random_number_generator
It is well hidden but still there, as shown in the image.  They have also added a section on distributions
http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?topicname=distributions_for_assigning_random_val...
0 Kudos
DanPatterson_Retired
MVP Emeritus
LOL!
Excellent summary!
At least a framework for further testing is available.
As Bill points out, none of the environment settings are available through the SA toolbar but are through Arctoolbox

the equivalent links for 10 are:
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//001w0000000w000000.htm
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//0017000000t5000000.htm
0 Kudos
AdamKilpatrick
Emerging Contributor
Good work Dan, I also just noticed this option in the environments setting of some tools in Arc Toolbox (I was actually looking at a different tool). So which should I be choosing...
STANDARD_C, ACM599, or MERSENNE_TWISTER ?

From a brief wiki search (I hang my head in shame), the Mersenne Twister option looks like its designed to work well...?

I noticed the option in the cost distance tool of SA. I've never noticed it before...but am now wondering how many tools use a random number generator in their implementation, and what is the implication of choosing the incorrect RNG type?
0 Kudos