What does a negative value of Mean Error indicate?

SirajMazumder · ‎06-19-2019

The problem is:

"I performed IDW for EMF levels from cell phone towers. The resulting Mean Prediction error in cross-validation is -0.0081."

I've read about Mean Prediction Error and RMS Prediction Error for IDW on the ArcGIS websites, but nowhere do they say what conclusions to draw when the Mean Prediction Error is negative.

DanPatterson_Retired · ‎06-19-2019

From here:

Performing cross-validation and validation—ArcGIS Pro | ArcGIS Desktop

You want your predictions to be unbiased (centered on the true values). If the prediction errors are unbiased, the mean prediction error should be near zero. However, this value depends on the scale of the data; to standardize these, the standardized prediction errors give the prediction errors divided by their prediction standard errors. The mean of these should also be near zero.

And you have -0.0081, which is good. The quote from the link explains what the value is.
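As a rough illustration of the quoted definitions, here is a minimal Python sketch. The arrays are made-up values, not data from this thread, and the errors are taken as predicted minus measured. The `std_error` column is an assumption: it only exists for methods that report a standard error for each prediction (kriging, for example), so the standardized version would not apply to a plain IDW run.

```python
import numpy as np

# Hypothetical cross-validation table (illustrative values only).
measured = np.array([0.05, 0.10, 0.15, 0.20])
predicted = np.array([0.06, 0.09, 0.13, 0.21])
# Hypothetical per-point prediction standard errors (not produced by IDW).
std_error = np.array([0.02, 0.02, 0.04, 0.04])

# Cross-validation error at each point: predicted minus measured.
errors = predicted - measured

# Mean prediction error: should be near zero for an unbiased model,
# but its size depends on the units and scale of the data.
mean_error = errors.mean()

# Mean standardized error: each error divided by its standard error,
# then averaged. This removes the dependence on the data scale.
mean_standardized_error = (errors / std_error).mean()

print(f"mean error: {mean_error:.4f}")
print(f"mean standardized error: {mean_standardized_error:.4f}")
```

Both numbers come out slightly negative for these made-up values, which simply means the underpredictions slightly outweigh the overpredictions.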

SirajMazumder · ‎06-20-2019

Thanks for the answer. I did read the link a long time ago. What confuses me is the negative value of the 'mean error'. I don't understand what the '-' sign says about the model or the data.

DanPatterson_Retired · ‎06-20-2019

It says nothing about the model or the data, other than that the absolute value is extremely close to zero, which is good. The sign only tells you that the value lies to the left of 0.

SirajMazumder · ‎06-20-2019

Fine. Then does that mean I can report the results without any additional ifs and buts?

DanPatterson_Retired · ‎06-20-2019

I would be more concerned by how much your regression line deviates from the 1:1 line in your image.

You don't draw conclusions from a single value. You have to look at the distribution of the data you have and at your derived relationships (regression equation, R squared, etc.).

EricKrause · ‎06-20-2019

I agree with Dan. The Predicted vs Measured line is fairly flat, and the scatterplot is pretty noisy. Also, the values in the Error column are fairly large relative to the measured values (often above 50% of the measured value).

If I were you, I would be much more concerned about accuracy than bias. I don't see any reason to doubt that the model is unbiased, but these numbers and graphs are pretty typical when the data does not have very much autocorrelation or is very noisy.
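To make the bias-versus-accuracy distinction concrete, here is a small Python sketch with made-up numbers (not the values from the screenshot). It is constructed so that the mean error is zero while the fitted Predicted vs Measured line is still much flatter than the 1:1 line. The variable names and the 50% threshold are only for illustration.

```python
import numpy as np

# Hypothetical cross-validation results (illustrative values only).
measured = np.array([0.04, 0.08, 0.12, 0.16, 0.20, 0.24])
predicted = np.array([0.12, 0.13, 0.14, 0.14, 0.15, 0.16])

errors = predicted - measured

# Bias: the average error. Here over- and underpredictions cancel out.
mean_error = errors.mean()

# Accuracy: root-mean-square error, which does not let errors cancel.
rmse = np.sqrt(np.mean(errors ** 2))

# Slope of the least-squares line of predicted against measured.
# A slope of 1 would match the 1:1 line; a slope near 0 is a flat line.
slope, intercept = np.polyfit(measured, predicted, 1)

# Share of points whose absolute error exceeds 50% of the measured value.
relative_error = np.abs(errors) / measured
share_above_half = np.mean(relative_error > 0.5)

print(f"mean error: {mean_error:.4f}")
print(f"RMSE: {rmse:.4f}")
print(f"slope: {slope:.3f}")
print(f"share of errors above 50% of measured: {share_above_half:.2f}")
```

A near-zero mean error next to a flat slope and large relative errors is the pattern described above: little evidence of bias, but poor accuracy.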

SirajMazumder · ‎06-20-2019

Agreed. I too suspected an issue with autocorrelation. Thanks once again.

SirajMazumder · ‎06-20-2019

Thanks Dan for your answer.

EricKrause · ‎06-20-2019

The mean error is the average of all the cross validation errors. A positive error means that the predicted value is larger than the true value, and a negative error means that the predicted value is less than the true value. For unbiased models, the underpredictions should cancel out the overpredictions on average, and the mean error should be close to zero. Sometimes it will be a little bit negative, sometimes a little bit positive, but if it is close to zero, you have evidence that the model is unbiased.

Your value literally means that on average, the cross validation predictions were 0.0081 lower than the true values. Since I can see in your screenshot that your measured values range from about 0.03 to 0.24, a value of -0.0081 is negligible. If the mean error were instead something like 0.1, that would be very concerning, as the level of bias would be as large as most of the measured values.
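The comparison can be put into numbers with a short Python sketch. It uses the figures quoted in this thread (a mean error of -0.0081 and measured values from about 0.03 to 0.24). Expressing the bias as a share of the measured range is just one possible yardstick chosen for this example, and `bias_share` is a hypothetical helper, not something the software reports.

```python
def bias_share(mean_error, measured_min, measured_max):
    """Absolute mean error as a fraction of the measured range."""
    return abs(mean_error) / (measured_max - measured_min)

# Approximate range of the measured values read off the screenshot.
measured_min, measured_max = 0.03, 0.24

# Reported mean error; the sign gives the direction of the average miss.
mean_error = -0.0081
direction = "underprediction" if mean_error < 0 else "overprediction"

reported = bias_share(mean_error, measured_min, measured_max)
# The "very concerning" case mentioned above, for comparison.
concerning = bias_share(0.1, measured_min, measured_max)

print(f"average {direction}: {reported:.1%} of the measured range")
print(f"a mean error of 0.1 would be {concerning:.1%} of the measured range")
```

With these figures the reported bias is under 4% of the measured range, while a mean error of 0.1 would be close to half of it.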