Here's where I am in the module:
Now that you have split your data, you'll train your random forest classifier using the training data you have created.
rfco = RandomForestClassifier(n_estimators = 500, oob_score = True)
rfco.fit(train_set[predictVars], indicator)
The test data is 90 percent of the United States coastal data that was not used to train the model, and will show the accuracy of your prediction.
seagrassPred = rfco.predict(test_set[predictVars])
test_seagrass = test_set[classVar].as_matrix()
test_seagrass = test_seagrass.flatten()
error = NUM.sum(NUM.abs(test_seagrass - seagrassPred))/len(seagrassPred) * 100
-------
Here are the last few entries in my Python log:
test_set = data.drop(train_set.index)
indicator, _ = PD.factorize(train_set[classVar[0]])
print('Training Data Size = ' + str(train_set.shape[0]))
print('Test Data Size = ' + str(test_set.shape[0]))
Training Data Size = 1000
Test Data Size = 9000
rfco = RandomForestClassifier(n_estimators = 500, oob_score = True)
rfco.fit(train_set[predictVars], indicator)
seagrassPred = rfco.predict(test_set[predictVars])
test_seagrass = test_set[classVar].as_matrix()
test_seagrass = test_seagrass.flatten()
error = NUM.sum(NUM.abs(test_seagrass - seagrassPred))/len(seagrassPred) * 100
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\brian\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone\lib\site-packages\pandas\core\generic.py", line 5274, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'as_matrix'
I'm concerned it might be a file path problem that I can't do anything about.
I'm working in ArcGIS Pro on a Windows Server 2012 R2 machine, so the actual file path to Documents may differ from what it would be on a stand-alone machine.
I'm new to Python but can sort of follow along.
Kathy Cappelli, I saw on a related thread that you're the person responsible for this cool module.
Is this what you are following?
Predict Seagrass Habitats with Machine Learning | Learn ArcGIS
There are some bugs in the instructions... I don't know if they have been corrected.
Dan,
Yes, the Predict Seagrass Habitats... module is what I'm working through, and I did find the thread you refer to prior to posting my question, but thanks anyway for sending it. In that case, the OP's question stemmed from naming an attribute 02 (with zero) instead of O2 (with letter O).
I'm having a different issue: calling the .as_matrix method. I have since found information saying that as_matrix is no longer included in recent pandas versions (it was deprecated in pandas 0.23 and removed in pandas 1.0), so this is a library version problem rather than a file path problem.
I think perhaps I need to learn how to replace .as_matrix() with .values (an attribute, so no parentheses) or .to_numpy() before I can proceed.
The deprecation of .as_matrix happened after the publication of the Seagrass module I'm trying to learn from.
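For anyone hitting the same error, here is a minimal sketch of the replacement. It assumes the NUM and PD aliases from the log above; the small DataFrame, the column name "Present", and the prediction array are hypothetical stand-ins for the module's test_set, classVar, and seagrassPred, not the real data.

```python
import numpy as NUM
import pandas as PD

# Hypothetical stand-ins for the module's test_set, classVar, and seagrassPred.
test_set = PD.DataFrame({"Present": [1, 0, 1, 1, 0],
                         "salinity": [35.1, 34.8, 35.0, 34.9, 35.2]})
classVar = ["Present"]
seagrassPred = NUM.array([1, 0, 0, 1, 0])

# .to_numpy() stands in for the removed .as_matrix();
# test_set[classVar].values (attribute, no parentheses) gives the same array.
test_seagrass = test_set[classVar].to_numpy()
test_seagrass = test_seagrass.flatten()

# Same error formula as in the module: percentage of mismatched predictions.
error = NUM.sum(NUM.abs(test_seagrass - seagrassPred)) / len(seagrassPred) * 100
print("Error = " + str(error) + "%")
```

Only the one method call changes; the flatten and error lines from the module stay as they are.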
Follow-up: I replaced the .as_matrix method with .to_numpy and got... a result. The accuracy estimates were exactly opposite of what they were supposed to have been using .as_matrix, so I would take a wild guess to say that .to_numpy might have reversed two columns in the matrix. By the end of the learning module, my map looked "right." Still, I'd like to know what happened between .as_matrix and .to_numpy.
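A possible explanation, offered as a guess rather than a diagnosis: .to_numpy() should not reorder anything, since it returns the same array as .values. What can flip is the label coding from the PD.factorize line in the log. factorize numbers labels in order of first appearance, so if train_set was drawn as a random sample (an assumption, since the sampling line is not in the log), the class seen first gets code 0 regardless of its actual value. The sketch below uses hypothetical data to show both points.

```python
import numpy as NUM
import pandas as PD

# Hypothetical class column; the real data comes from the module's dataset.
df = PD.DataFrame({"Present": [1, 0, 1, 0]})

# to_numpy() keeps row and column order and matches the .values attribute.
assert NUM.array_equal(df.to_numpy(), df.values)

# factorize() assigns codes by first appearance, so the coding depends on row order.
codes_a, uniques_a = PD.factorize(PD.Series([0, 1, 1, 0]))  # value 0 seen first
codes_b, uniques_b = PD.factorize(PD.Series([1, 0, 0, 1]))  # value 1 seen first

print(codes_a, list(uniques_a))
print(codes_b, list(uniques_b))
```

In the second case code 0 stands for the original value 1, so a model trained on those codes would predict labels that are inverted relative to the raw 0/1 values in the test column, and the error figure would come out as roughly 100 minus the expected one. Checking the second value returned by PD.factorize (discarded as _ in the log) would show which coding a given run used.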