ArcHydro Wetland Identification Model (WIM) with multi-classed groundtruth data

MalloryGill · ‎02-13-2023

Hi @GinaO_Neil and WIM users! I'm using the latest version of the ArcHydro (3.0.50) WIM tools on multi-class ground truth data. When running Preprocess Ground Truth Data, I check the "features represent more than one target class" option and uncheck the "class of areas outside of the ground truth features is known" option. The processed wetlands show up as expected, with data classes of 0 and 1 and NoData for the areas outside my ground truth data.

These are the ground truth wetland features (symbolized by class)

This is what they look like after preprocessing

The training raster looks good as well

I run into problems later in the workflow, though. The 0 class of the raster seems to get merged with the NoData part of the extent: my final predictions have 0 and 1 classes, but the 0 class appears to cover both the actual 0-class data AND the NoData areas. Once I've trained the model and built my composite raster, here's what my prediction looks like

It predicts what I'd expect for the 1 class, but the 0 class locations seem to merge with the NoData areas from the ground truth. Am I missing a setting somewhere or skipping a step?

Thanks!

GinaO_Neil · ‎02-13-2023

Hi @MalloryGill ! If I am understanding correctly: your final prediction raster is filling in No-Data areas with Data, and all of those filled in pixels appear to have a value of 0?

If so, this is somewhat expected behavior. The No-Data areas in prediction outputs represent a 50/50 "guess", since there are no predictor variables there to make meaningful predictions from. I've had my no-data areas render as either 0 or 1. Meaningful predictions should still be made wherever the predictor variable raster provided data. One way to check this is to bring in the probability raster and confirm there are probability values where expected. You will likely see probability values of 0.5 in the no-data areas. I recommend using Extract By Mask on the prediction outputs to isolate the actual data areas and avoid confusion.
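To make the masking idea concrete, here is a minimal NumPy sketch on toy arrays. It is not the WIM tools' code and not the Extract By Mask tool itself. The array values, the `NODATA` marker of -9999, and the 0.5 threshold are all assumptions for illustration. It flags the cells that have no predictor data and a probability of 0.5, then blanks those cells out of the class raster.

```python
import numpy as np

NODATA = -9999  # hypothetical NoData marker used only in this sketch

# Toy predictor band: NODATA marks cells with no predictor variables.
predictors = np.array([
    [0.2, 0.8, NODATA, NODATA],
    [0.1, 0.9, 0.7, NODATA],
    [NODATA, 0.3, 0.6, 0.4],
], dtype=float)

# Toy wetland probability raster: cells without predictors sit at 0.5.
probability = np.array([
    [0.10, 0.85, 0.50, 0.50],
    [0.05, 0.95, 0.70, 0.50],
    [0.50, 0.20, 0.65, 0.30],
])

# Toy class raster: every cell gets 0 or 1, including the 50/50 cells
# (assumed threshold; here a probability of exactly 0.5 falls to class 0).
prediction = (probability > 0.5).astype(int)

# Cells where the predictors actually provided data.
valid = predictors != NODATA

# Check: cells outside the predictor data whose probability is 0.5.
guess_cells = np.isclose(probability, 0.5) & ~valid
print("50/50 cells outside predictor data:", int(guess_cells.sum()))

# Masking step: keep predictions only where predictors exist.
masked_prediction = np.where(valid, prediction, NODATA)
print(masked_prediction)
```

After masking, the 0 class only covers cells that were actually predicted as 0, and the guessed cells are back to NoData.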

I hope that answers your question. Let me know otherwise.

Best,

Gina

MalloryGill · ‎02-14-2023

Thanks Gina! This does make total sense. Consider it user error - I was misunderstanding how the No Data areas would be treated in the outputs. Initially I was expecting there to be "no data" pixels in the outputs as well as the predicted 0 and 1 classes, which in hindsight doesn't make sense, so your explanation set me straight. Thank you!

GinaO_Neil · ‎02-14-2023

Glad that cleared things up. I'll automate that extract-by-mask operation in future versions of the tools. Thanks!