Hi there,
I have been running data through the Forest-based Classification tool.
I am interpreting the statistics and was hoping to get some further information on what some of them mean. I have been through the online documentation/help docs and have also watched a number of videos relating to the tool, including "The Forest for the Trees:.." by Lauren Bennett et al. All of these resources are excellent.
I was hoping to get an explanation of some of the statistics not covered in these resources including:
- The Model Out of Bag Errors MSE statistic?
- Top variable importance. Where does the Importance number come from?
- The MCC in Classification Diagnostics?
- With all these statistics in mind, and understanding that they each measure performance differently, which statistics would give the best overall indication of the performance of the model? I notice that below the 'Validation Data : Classifications Diagnostics' there is a quoted accuracy, e.g. 'Median Accuracy of 0.865 was approx. reached at seed 401355.'
Thanks for your help.