Unable to run Segment Mean Shift on output from Principal Component Analysis

MervynLotter · ‎07-30-2015

Hi there. Given that segmentation can only analyse 3 bands, I tried using Principal Component Analysis to reduce a Landsat 8 image to the first 3 components, which explain most of the variance in the imagery. I set the Principal Components output to 3 bands. When I tried loading this new Principal Components image into the Segment Mean Shift geoprocessing tool (I tried this in both ArcMap 10.3 and ArcGIS Pro 1.1), I got the same error in both: “ERROR 001734 Raster has no stats but needed for proper stretch.” For the life of me, I have been unable to calculate any form of raster statistics that makes it work. I am able to use this same Principal Components output image in other classification tools, such as Maximum Likelihood, ISO Cluster, etc.

Where am I going wrong?

larryzhang · ‎07-31-2015

M.,

As part of learning Esri's "object-based" image classification alongside you, I raise the following just for discussion:

First of all, with the "object-based" approach, ArcGIS 10.3/Pro offers breakthrough improvements in high-resolution image segmentation and classification (e.g. 1-m GeoEye, 0.8-m GF-2), especially in the workflow and in directly supporting any 3-band image for feature extraction. Refer to Understanding segmentation and classification—Help | ArcGIS for Desktop

However, when working with lower-resolution imagery for classification (like Landsat), either a multiband image (via a layer) or a PC image can be used for segmentation and raster classification. Please refer to the ArcGIS 10.1 Help.

If possible, it is worth trying any 3-band Landsat image (which bands to use mostly depends on the purpose) for raster classification in 10.3.

Secondly, the generalization and feature clean-up algorithms in ArcGIS 10.3/Pro, including the effective conversion of the classified raster into vectors, are still limited.

Please share your new findings, if any.

JeffreySwain · ‎07-31-2015

Is your output from Principal Components 32-bit signed? When I tested this workflow, I had to convert the Principal Components output into an 8-bit unsigned integer raster (value range from the PC of 0-28). That 8-bit unsigned raster works in Segment Mean Shift. I am not sure whether the issue is the 32-bit signed pixel type or something else. I would recommend creating a Support ticket to address it, but the Copy Raster tool workflow should keep your analysis moving.

MervynLotter · ‎07-31-2015

Hi Jeffrey (and Larry)

I checked, and the output of my Principal Components analysis is a 32-bit signed raster. The help file, which Larry Zhang kindly provided the link to, does state that "The Segment Mean Shift tool accepts any Esri-supported raster and outputs a 3-band, 8-bit color segmented image..." Well, almost. Following your suggestion of using the Copy Raster tool, I tried converting the PC output to 16-bit unsigned and that did not work either. The 8-bit unsigned does work, as you suggest. Thank you!

How does one log a Support ticket?

Is the Imagery and Remote Sensing group the correct place to ask a question about Principal Component Analysis?

Thanks

JeffreySwain · ‎07-31-2015

I saw the 'any raster' statement and wonder whether it should extend to 32-bit signed rasters, or whether it more specifically means any 8-bit/16-bit multispectral raster. I have used 8-bit unsigned and 16-bit unsigned rasters with multiple bands and not seen that error. I had never tried a 32-bit signed raster until today.

http://support.esri.com/en/

This is the address to log in to your account and request a Support ticket. I would consider that route, since they have the ability to log a bug report for you if this is judged to be a bug. Here on the forums, you can commiserate with other users and hopefully get someone on the Esri side to answer you, but through Support you will have your very own analyst to help you out.

larryzhang · ‎08-02-2015

From our practice, it looks like this:

With ArcGIS 10.3 on a 64-bit machine (Windows 7, 64-bit), the GP tool "Segment Mean Shift" supports PCA images in 8-bit and 16-bit, but not 32-bit.

++++++++++++++

With any 'built-in' PCA image, you can certainly try object-based classification in 10.3. However, it is doubtful that the results (the classified raster) are accurate and reliable, particularly for high-resolution or hyperspectral imagery.

One reason is that a PCA image is a 'built-in & interpreted' image derived from many bands. This type of operation usually involves some loss of information, and that lost information may be very important for your study case. For example, geologists may want to extract "basalt rocks" from a multiband (hyperspectral) image. The system's built-in PCA algorithm will not help you highlight the information related to basalt rocks.

So, the theory of PCA is correct, but you shouldn't rely fully on any built-in PCA. Instead, you should make use of as many bands as possible, or derive an image from the multiband image based on your study purpose ...

MervynLotter · ‎08-02-2015

Hi Larry

Thank you for this useful information. I was not quite sure how useful a PCA actually is to the remote sensing community, so it is good to hear your thoughts. I am only an occasional user. It certainly makes sense that if one was looking for a specific signature, or set range of sensors, then this would *not* be a good approach to take. I guess it all depends on the kinds of questions one hopes to answer. After running my Landsat 8 image through a PCA, I admit I was most impressed with the visual output of the first 3 axes. In my analysis I was looking at a proposed RAMSAR site and the wetlands were really quite striking, more so than just looking at individual bands, or combinations thereof. But I may have been lucky in this particular instance, as I am sure that the PCA output is very dependent on the input imagery and area of interest in the analysis.

For the interest of others ... in species distribution modelling, running a PCA is a great way of ensuring that the input layers are not correlated (they should never be). The unfortunate result is that it then becomes difficult to relate observed environmental drivers to the actual output, so it is near-impossible to state that, for example, minimum temperature during the coldest month was an important variable driving a particular species model. One can then only say that PCA Axis 5 was informing the output of the analysis ... But if you want to maintain a robust selection of environmental variables, then the correct geoprocessing tool to use in this case is the *Band Collection Statistics* tool, making sure to tick the Compute covariance and correlation matrix option. This would identify the extent to which various environmental layers are correlated with one another and negate the need to use PCA. Bearing in mind what Larry wrote, one may also lose some of the finer nuances within a particular environmental variable layer if one runs a PCA, so I guess it is best to use the original layers.
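
For readers who want to see what such a correlation check amounts to, here is a minimal sketch in plain Python. It is not the Band Collection Statistics tool and makes no claim about how that tool computes its matrix; it simply calculates Pearson correlations between layers that have been flattened to equal-length lists of cell values. The layer names, the sample values, and the 0.9 cut-off are all hypothetical and chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant layer has no variance, so correlation is undefined.
        raise ValueError("correlation is undefined for a constant layer")
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(layers, threshold):
    """Return (name_a, name_b, r) for every layer pair with |r| >= threshold."""
    names = list(layers)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(layers[a], layers[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical cell values for three environmental layers (illustration only).
layers = {
    "min_temp": [1, 2, 3, 4, 5],
    "mean_temp": [2, 4, 6, 8, 10],
    "rainfall": [5, 3, 4, 1, 2],
}

# The 0.9 cut-off is an assumption; choose one that suits your own study.
for a, b, r in correlated_pairs(layers, threshold=0.9):
    print(f"{a} and {b} are strongly correlated (r = {r:.2f})")
```

Working with the original layers this way keeps each variable interpretable: a flagged pair tells you which named layer you might drop, which a PCA axis cannot.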

Thanks for the feedback Larry, much appreciated.

JosephMcGlinchy · ‎08-06-2015

A couple of thoughts here.

Principal Components Analysis is very valuable for a relevant dataset. I say that because this type of analysis is typically performed on datasets with high degrees of correlation between data points, such as hyperspectral data cubes. Lower spectral resolution datasets, such as Landsat, may not have as much spectral correlation between bands and the result of a principal components analysis may not hold as much water, so to speak. That being said, it is typical that the first 3 components (bands) of a PCA will represent > 95% of the variability in the data.
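
To check how much variability the first few components actually capture for a given image, the cumulative share can be worked out from the eigenvalues of the PCA. The sketch below is plain Python and assumes you already have the eigenvalues from your own PCA run; the numbers used here are made up purely to show the arithmetic and are not from any real Landsat scene.

```python
def cumulative_explained_variance(eigenvalues):
    """Return the cumulative percentage of variance explained, largest component first."""
    total = sum(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)
    running = 0.0
    cumulative = []
    for value in ordered:
        running += value
        cumulative.append(100.0 * running / total)
    return cumulative


# Hypothetical eigenvalues for a six-band image (illustration only).
eigenvalues = [70.0, 20.0, 7.0, 2.0, 0.5, 0.5]

for index, percent in enumerate(cumulative_explained_variance(eigenvalues), start=1):
    print(f"first {index} component(s): {percent:.1f}% of total variance")
```

If the third value in the list falls well short of what you need, keeping only 3 components for segmentation is discarding a meaningful share of the information.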

If the output of the PCA tool gives you a data type that is not 8-bit unsigned, you can convert it to 8-bit unsigned using the Copy Raster workflow as mentioned above, or by applying a Stretch raster function via the Image Analysis window and specifying the output to be 8-bit unsigned integer. This result can then be used as input to the Segment Mean Shift GP tool / raster function.
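
Conceptually, squeezing signed principal-component values into the 8-bit unsigned range is a rescaling of each band to 0-255. The plain-Python sketch below shows a simple linear min-max stretch on a handful of hypothetical values. It is only an illustration of the idea, and it is an assumption that it resembles what happens to your data; it is not the implementation of Copy Raster or of the Stretch raster function, which offer their own scaling options.

```python
def stretch_to_uint8(values):
    """Linearly rescale numbers to integers in the range 0..255 (min-max stretch)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant band has no range to stretch; map every cell to 0.
        return [0 for _ in values]
    return [round((v - lo) * 255 / (hi - lo)) for v in values]


# Hypothetical signed principal-component values for one band.
pc_band = [-1200, -300, 0, 450, 1350]
print(stretch_to_uint8(pc_band))  # prints [0, 90, 120, 165, 255]
```

Note that any such stretch quantizes the data to 256 levels per band, so subtle differences between neighbouring PC values can be merged before segmentation.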

Hope it helps!!

-Joe

larryzhang · ‎08-10-2015

Certainly, agreed with you, Thanks, Joe

+++++++++++

Personally, I think a few other points are worth highlighting here:

As image techniques for dimension reduction and compression, PCA/Kernel PCA (and others like Isomap, Diffusion Maps, Laplacian Eigenmaps, ICA, etc.) have made good contributions to the remote sensing community, particularly in the past when computer systems had ONLY limited capabilities (in memory, operating system and data storage).

With the latest advances in IT and satellite technology, it is highly advisable (actually, highly required) that many ‘traditional’ image algorithms (like band reduction) in remote sensing packages for image processing and analysis be reviewed again (even re-evaluated and re-implemented). That said, optimal band reduction is still widely used in the majority of organizations to avoid computational overhead, particularly when the dimension reduction algorithms and workflow are optimized case by case…

In fact, it is good to see that imagery end users now have the choice of innovative algorithms and effective workflows that directly analyze all bands of ‘big data’ images in an HPC (High Performance Computing) environment without reducing image dimension, especially when extracting a wide range of information, including many specialized kinds, from high-resolution and hyperspectral images…