I'm trying to run the Forest-Based Classification tool (Spatial Statistics toolbox) on data from several sources that I joined and cleaned externally in Python. When I run the tool, it fails with the error "Too few records for analysis. This tool requires at least 20 feature(s) to compute results." What's confusing is that the data has 781 rows, and every row has values in many (>600) columns; i.e., there is no row in which every column value is null. Despite this, the tool apparently had trouble reading all 781 rows.
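For what it's worth, this is roughly how I checked for fully-null rows (sketched here on a toy CSV stand-in rather than my real joined table):

```python
import csv
import io

# Toy stand-in for the joined table; with the real data, open the exported file.
data = io.StringIO("id,a,b\n1,x,2\n2,,\n3,y,\n")
rows = list(csv.DictReader(data))

# Count rows where every attribute (excluding the id) is empty/null.
fully_empty = [
    r for r in rows
    if all(v in ("", None) for k, v in r.items() if k != "id")
]
print(len(rows), len(fully_empty))  # → 3 1
```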
The attached screenshots show the full error message and part of the input I gave the tool. The explanatory training variables are too numerous to list in full; they are a mix of categorical and numerical variables. I'm using ArcGIS Pro version 3.0.2.
One other thing that may or may not be relevant: I initially couldn't load the data in shapefile format. When I tried to import it, I got the error "failed to load data," even though it opened and plotted fine in a Jupyter notebook and in QGIS. I finally got it to load in ArcGIS by exporting to a .gpkg instead, but now the tool fails on that layer. I don't know whether the two issues are related.
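Since a GeoPackage is just an SQLite database, I can at least count the layer's rows outside ArcGIS, including rows with null geometry (which, as far as I understand, some tools may skip). The sketch below uses an in-memory stand-in for the .gpkg; with the real file I'd connect to its path, and the layer name `joined_layer` is made up:

```python
import sqlite3

# Stand-in for the real .gpkg: 781 rows, 3 of them with a NULL geometry.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE gpkg_contents (table_name TEXT, data_type TEXT)")
con.execute("INSERT INTO gpkg_contents VALUES ('joined_layer', 'features')")
con.execute("CREATE TABLE joined_layer (fid INTEGER PRIMARY KEY, geom BLOB, val REAL)")
con.executemany(
    "INSERT INTO joined_layer (geom, val) VALUES (?, ?)",
    [(None if i < 3 else b"pt", i) for i in range(781)],
)

# List the feature layers and count total rows vs. rows lacking a geometry.
for (layer,) in con.execute(
    "SELECT table_name FROM gpkg_contents WHERE data_type = 'features'"
):
    total = con.execute(f"SELECT COUNT(*) FROM {layer}").fetchone()[0]
    null_geom = con.execute(
        f"SELECT COUNT(*) FROM {layer} WHERE geom IS NULL"
    ).fetchone()[0]
    print(layer, total, null_geom)  # → joined_layer 781 3
```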
Any help would be greatly appreciated! Thanks in advance.
Hmmm... I'm not an expert in this GP tool, but one of the warning messages caught my eye: geographic coordinates are being used with a meters-based measure. The units of a geographic coordinate system are degrees, so I wonder if that is causing the issue. One thing to try is to use the Project tool to reproject the dataset to a projected coordinate system such as State Plane or UTM. Does the tool behave differently then?
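To illustrate what can go wrong, here's a quick pure-Python sketch (not ArcGIS code): two points one degree of longitude apart on the equator are roughly 111 km apart on the ground, but a tool reading the degree coordinates as if they were meters would see a "distance" of 1.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    R = 6371000  # mean Earth radius in meters
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * R * asin(sqrt(a))

# One degree of longitude on the equator:
naive_deg = 1.0                   # what a meters-based tool sees in degree units
true_m = haversine_m(0, 0, 1, 0)  # actual ground distance, ~111,195 m
print(naive_deg, round(true_m))
```

So any distance threshold or neighborhood the tool computes in meters is off by roughly five orders of magnitude on unprojected data, which could easily make it discard records.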