Classify Wetland Pixels with Deep Learning - Training Input Raster Questions

MadelineHayes1 · 03-27-2025

I'm trying to classify wetland pixels using deep learning (U-Net), following the documentation here. This guide suggests using ArcHydro's WIM tools in conjunction with the deep learning tools. Following this, I created rasters in my study area for Topographic Wetness Index, Depth to Water Index, and Curvature using the ArcHydro toolbox. The tools output all 3 rasters as 32-bit float. I composited the 3 rasters, which also produced a 32-bit float TIFF. However, posts in this community suggest that the pixel depth for the input training raster should be 8-bit unsigned. This has led to the following questions:

Should my composite raster from the ArcHydro tools be 8-bit unsigned, or is that rule just for imagery? I also don't understand WHY imagery needs to be 8-bit for deep learning.
If I wanted to add more predictor variables to my composite, like 4-band NAIP imagery, is there anything in particular I should do/consider? Or should it work using the standard workflow? Using the standard workflow, I would clip the NAIP raster to the same extent as the 3 predictors (TWI, DTW, curvature), make sure the resolution is the same (3m), then composite the rasters normally (in the screenshot below, Extract_MD_N1 is the 4-band NAIP imagery). I'm not sure if it is feasible to combine aerial imagery with the 3 topographically derived indicators. I believe U-Net treats each band as a separate channel, so I don't see why it wouldn't be possible, but I haven't been able to find any examples. I have two questions stemming from this:
1. What should the pixel depth be? NAIP is 8-bit, the ArcHydro predictors are 32-bit float. According to this post, any training imagery should be 8-bit unsigned. Should I convert the ArcHydro predictors to 8-bit, or the NAIP 4-band image to 32-bit?
2. Assuming I now have a 7-band raster for my training data (b1=DTW, b2=TWI, b3=Curvature, b4=red band from NAIP, b5=green band from NAIP, b6=blue band from NAIP, b7=NIR band from NAIP) - should I treat this as multispectral imagery when calling the prepare_data() function? i.e. specifying "imagery_type"? I don't entirely understand this, but I think specifying the imagery_type preserves band weights from the pre-trained models used for transfer learning. If this is true, how would I write this into the code? Would something like this be correct, where 'u' is the "miscellaneous" bands from ArcHydro?
  data = prepare_data(data_path, batch_size=16, imagery_type='ms', bands=['u', 'u', 'u', 'r', 'g', 'b', 'nir'])

Thank you in advance!

ShivaniPathak · 04-06-2025

Hi @MadelineHayes1, regarding the use of 8-bit unsigned format, we've conducted several tests and found that rasters in this format work best for training deep learning models.

You can definitely create a composite that includes both NAIP bands and ArcHydro outputs. I recommend adding the NAIP bands first, followed by the ArcHydro outputs, all in 8-bit unsigned format.
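To illustrate what converting a 32-bit float predictor to 8-bit unsigned involves, here is a minimal sketch of a linear stretch in plain Python. It is not taken from the ArcHydro or deep learning tools. The function name rescale_to_uint8, the 2nd/98th percentile clip, and the use of None for NoData cells are all assumptions made for this example. In practice the same stretch would more likely be applied with a raster tool or an array library before compositing.

```python
def rescale_to_uint8(values, low_pct=2.0, high_pct=98.0):
    """Linearly stretch float values to integers in 0-255.

    Hypothetical helper for illustration only. Values are clipped to the
    [low_pct, high_pct] percentile range before stretching, so a few
    extreme cells do not compress the rest of the range. None entries
    (assumed here to mark NoData) are passed through unchanged.
    """
    valid = sorted(v for v in values if v is not None)
    if not valid:
        raise ValueError("no valid values to rescale")

    def percentile(p):
        # Percentile by linear interpolation between neighbouring ranks.
        k = (len(valid) - 1) * p / 100.0
        lo = int(k)
        hi = min(lo + 1, len(valid) - 1)
        return valid[lo] + (valid[hi] - valid[lo]) * (k - lo)

    lo_val = percentile(low_pct)
    hi_val = percentile(high_pct)
    span = hi_val - lo_val

    out = []
    for v in values:
        if v is None:
            out.append(None)
        elif span == 0:
            # Constant band: nothing to stretch, map everything to 0.
            out.append(0)
        else:
            clipped = min(max(v, lo_val), hi_val)
            out.append(round((clipped - lo_val) / span * 255))
    return out


# Example with made-up wetness-index values and one NoData cell.
twi = [0.0, 2.0, None, 10.0]
scaled = rescale_to_uint8(twi, low_pct=0, high_pct=100)
```

Each float band (TWI, DTW, curvature) would be stretched separately, since their value ranges differ. The stretch limits used for training should be reused when preparing rasters for inference, so the same input value always maps to the same 8-bit value.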

For reference, here are two sample notebooks you can check out:

Let me know if you need further clarification!