POST

If you would like to put this analysis into a quantitative framework I would recommend stepping out to the free software Map Comparison Kit. For this problem, the two models to investigate are the Fuzzy Kappa and Fuzzy Weighted Kappa. The Kappa statistic will evaluate the proportion of change corrected for chance agreement and fuzzy approaches account for inherent uncertainty or error in the discrete process. There is just nothing like this available in the common suite of GIS software. The Weighted Kappa is of particular interest. The model allows you to define a transitional weights matrix that provides, equal, less or more, emphasis for certain change trajectories. An example of where this would be relevant is weighting transition from a certain vegetation class to urban higher than a transition to a different vegetation class. When I review manuscripts that evaluate change direction using classified data (eg., NLCD) one thing that I look for is if the authors evaluated the significance of their results or accounted for uncertainty. If not, I often find the results suspect because model error and random agreement are not accounted for. Another approach would be a ttest of the change but, you certainly need something providing statistical support. If this is a qualitative assessment, then I would not worry.
... View more
03032016
09:52 AM

1

0

3

POST

Since the bands are quantized and calibrated, this is very straight forward for Landsat 8. You will need a few band specific parameters from the scenes metadata. The scaling factors are standard so, once you have the bandspecific values they can apply to any scene. However, this does not apply to the sun elevation, which is scene specific. Mp = REFLECTANCE_MULT_BAND_x (bandspecific multiplicative rescaling factor) Ap = REFLECTANCE_ADD_BAND_x (bandspecific additive rescaling factor) SE = SUN_ELEVATION (sun elevation angle) Then, atsensor reflectance is just a matter of some simple raster algebra in the raster calculator. ( Mp * Qcal + Ap ) / sin(SE) where; Qcal = digital numbers (DN) raster of band n
... View more
03112015
12:21 PM

0

0

5

POST

A kernel density estimate is a fit distribution and is commonly referred to as a probability density function (PDF). If you transform the resulting PDF it will become somewhat meaningless.
... View more
01162015
08:54 AM

0

0

4

POST

Because it tends to stabilize the model across parameter space, standardizing your data is a common practice in this type of modeling. It is critical to apply coefficients to the range of model data and not the raw data. In other words, if the original model was standardized, then the coefficients are only applicable to the standardized data. It sounds like you are attempting to apply the coefficients to nontransformed data, which is not valid. As such, it may be prudent to contact the authors to find out exactly how they specified their model. The term "standardize" can unfortunately refer to multiple data transformations ranging from row standardization [x / max(x)] to centering the data to zero mean and a standard deviation of one, which sounds like what was done. If you can get the mean and sd for each covariate from the authors, then you can just apply a transformation on your subset data rather than having to acquire and calculate them on the full raster data.
... View more
07282014
10:44 AM

1

0

5

POST

Well, the problem sounds like your correlation approach is not working and forcing the range will not correct the problem. Also the type of correlation (Person, Kendall, Spearman) is important given the data. I would revisit your methodology to figure out why your correlation coefficients are off rather than trying to fix the issue post hoc. Any approach for deriving correlation coefficients at the raster cell level using a combination of ArcMAP and Excel are opaque at best and some details would be required to evaluate the approach. I am not clear on why you want these correlations as a raster. If you have well locations, with associated depth, you could just assign the gravitational anomaly raster values using the "Extract Multi Values to Points" tool, export the resulting attribute table to a flatfile (eg., csv) and calculate the correlation coefficient in Excel (not that Excel is good for statistical analysis). It is interesting that you mention this particular correlative relationship. I just published a paper in PLoS One (Evans & Kiesecker 2014) using gravitational anomaly data, along with other covariates, in a probabilistic model of nonconventional oil/gas. I used a nonparametric model and found that this particular relationship is highly nonlinear in nature. Because of this, any correlations are likely to be erroneous. I would imagine that it is time to move your analysis into a statistical software and it may also be a good point to consult a statistician. Evans, J.S., J.M. Kiesecker (2014) Shale Gas, Wind and Water: Assessing the Potential Cumulative Impacts of Energy Development on Ecosystem Services within the Marcellus Play. PLoS ONE 9(2): e89210. doi:10.1371/journal.pone.0089210 If you were to produce a surface of depth, using Kriging or any such interpolation method, you could easily calculate a moving window correlation surface in R. Here is an example, that includes a Kriging estimate, for calculating a moving window correlation surface. # Moving window correlation function # x raster of x # y raster of y # dist distance (radius) of correlation window, the default AUTO calculates a window size # ... additional arguments passed to the cor function mwcor < function(x, y, dist="AUTO", ...) { require(sp) require(spdep) if ( (dist == "AUTO") == TRUE){ cs < x@grid@cellsize[1] dist = sqrt(2*((cs*3)^2)) } else { if (!is.numeric (dist)) stop("DISTANCE MUST BE NUMERIC") } nb < dnearneigh(coordinates(x),0, dist) v=sapply(nb, function(i) cor(x@data[i,], y@data[i,], ...)) if( (class(v)=="numeric") == TRUE) { v = as.data.frame(v) } else { v = as.data.frame(t(v)) } coordinates(v) = coordinates(x) gridded(v) = TRUE ( v ) } # Example require(gstat) require(sp) require(spdep) data(meuse) data(meuse.grid) coordinates(meuse) < ~x + y coordinates(meuse.grid) < ~x + y # GRID1 log(copper): v1 < variogram(log(copper) ~ 1, meuse) x1 < fit.variogram(v1, vgm(1, "Sph", 800, 1)) G1 < krige(zinc ~ 1, meuse, meuse.grid, x1, nmax = 30) gridded(G1) < TRUE G1@data = as.data.frame(G1@data[,2]) # GRID2 log(lead): v2 < variogram(log(lead) ~ 1, meuse) x2 < fit.variogram(v2, vgm(.1, "Sph", 1000, .6)) G2 < krige(zinc ~ 1, meuse, meuse.grid, x2, nmax = 30) gridded(G2) < TRUE G2@data < as.data.frame(G2@data[,2]) # Moving window correlation surface gcorr < mwcor(G1, G2, 500, method="spearman") # Plot results colr=colorRampPalette(c("blue", "yellow", "red")) spplot(gcorr, col.regions=colr(100))
... View more
07232014
12:18 PM

1

1

28

POST

NDVI is a ratio between two bands. The equation to calculate the ratio on a single vector will result in all zero values. The equation you provide is a ratio in the upper and lower tails of the distribution and will result in a single value.
... View more
07232014
10:28 AM

0

0

28

POST

Are you sure that you want to do this? You are, in effect, creating a twotailed distribution where the negative tail is indicating a proportional negative response equal to the positive. If this is what you are after then you need to identify a "hinge point" that defines the inflection point, in the original distribution, indicating where the distribution will be centered on 0. This will provide relevant information on where the distribution changes to a negative influence. If the distribution is not centered then the result will be arbitrary. The equations for normalizing are not applicable here. This will not necessarily provide a bounded 1 to 1 range but the equation for standardizing a distribution to a standard deviation of 1 and a mean of 0 thus, providing a Zscore, is: [(x  mean(x)) / standard deviation(x)]. You can play with the math to center on a different value. There is a good reason that we normally scale distributions to a 01 range. The problem here is that a Zscore standardization assumes a Gaussian distribution, which you likely do not have. Overall I do not believe that this is a good idea. Perhaps if you provided some context I will be able to provide a relevant alternative. If you data represents a known fixed range (e.g., correlations coefficients) and for some reason exhibit erroneous values, I would recommend just truncating them using a con statement (eg., con(raster < 1, 1, raster) ).
... View more
07232014
10:24 AM

1

3

28

POST

You are being offered advice based on the "cult of high spatial resolution". Bigger is not always better. There are instance where high resolution data is quite appropriate and, since it is free, calibrated RedGreenBlueNIR, your first stop should be NAIP ( http://www.fsa.usda.gov/FSA/apfoapp?area=home&subject=prog&topic=nai ). Unless there is a notable difference in the photosynthetically active radiation between the vegetation types, a metric like NDVI will likely not provide adequate spectral separation. You need to base your imagery decision on the intended classification goals. There is a bit of art invloved in selecting appropriate imagery that represents a balance between spatial, spectral and temporal resolution. There are instances where, with the addition of the SWIR band, 10m SPOT 5 will outperform 1m Quickbird because the spectral discrimination is more appropriate in the shortwave IR range. If you have vegetation the is not readily separable in the visible and NIR range then high resolution data buys you nothing. I have been very impressed with the new Landsat 8 16bit data. If your problem can be addressed with moderate resolution (30m) data then this would be the option I would aim you towards. That said, the best satellite data I have worked with is WordView 2. The spatial resolution in the 8 multispectral bands is 1.85m and 46cm in the panchromatic. The addition of the rededge and blueedge bands provide amazing discrimination of many vegetation types. It is, however, quite expensive. You can acquire information on WorldView 2 as well as QuickBird, IKONOS and GeoEye on the DigitalGlobe website ( http://www.digitalglobe.com/ ). If you have an academic affiliation DigitalGlobe has a special pricing schedule and federal partners have free access through various sources.
... View more
06302014
10:40 AM

0

1

4

POST

The raster package predict function has error handling for unallocated factor levels so, the culprit must be cubist.predict crashing when presented with a new factor level in "new data". You may want to let the package authors know about this bug. You could easily demonstrate it by modifying the levels in one of the factorial covariates in "data" and backpredicting the model. I have some R tutorials on my Quantitative Spatial Ecology website that you may find interesting. In fact, I have a tutorial implementing a similar model but, using random forests (which I would highly recommend you looking into as an alternative to Cubist).
... View more
04092014
07:23 AM

0

0

6

POST

You forgot to explain details on the offensive raster. Is this a factorial variable in the model? If so, are all of the levels in the raster represented in the model? The GDAL bindings to the ESRI API have never been too stable so, you may want to convert your rasters to an "img" or "tif" format. Keep in mind that the authors of a given package write the appropriate predict function and then make it available, in the global environment through R's generic "predict" function. Sometimes the package developers do not do the best job at error checking their functions and bugs arise. In this case, not only do you have "predict.cubist" but also "predict.raster" interacting to make your life difficult. If they are handling NA or unallocated factors differently, R may throw an error. This is just speculation but as a first step I would check the match in levels between you raster and model. Now, lets trim down your code a bit. First, make sure that your object names do not conflict. There is a base function in R called "data" so you should never name an object the same as a existing function. As R searches the Global environment, this can cause bizarre problems in for/while loops. Think efficient! For example, if you have a number of objects to operate on think about a loop or lapply. Here is a loop for coercing columns into factors. for( i in c("georeg", "landscape", "wetlands", "landuse", "geology", "fgjord") ) { data[,i] < as.factor(data[,i]) } When possible use indexing, and the number of the index, rather than the name. for( i in c(1,2,4,6,8,10) ) { data[,i] < as.factor(data[,i]) } You are creating many unnecessary objects in reading your rasters. You can use wildcards to pass directory listing to stack. Say that your rasters are "img" format or the only files in the directory you could for example : var.stack < stack( list.files(getwd(), pattern="img$", full.names=TRUE) ) You could also create a vector of everything in the directory: predictors < list.files() And then index specific elements in the resulting vector: var.stack < stack( predictors[c(1,2,4,7)] )
... View more
04082014
12:51 PM

0

0

6

Online Status 
Offline

Date Last Visited 
3 weeks ago
