Consider a labeled data set containing 100 data instances, randomly partitioned into two sets A and B of 50 instances each. We use A as the training set to learn two decision trees: T10 with 10 leaf nodes and T100 with 100 leaf nodes. The accuracies of the two decision trees on data sets A and B are shown below:
(a) Based on the accuracies shown in the table above, which classification model would you expect to have better performance on unseen instances?
(b) Now suppose you test T10 and T100 on the entire data set (A + B) and find that the classification accuracy of T10 on (A + B) is 0.85, whereas the classification accuracy of T100 on (A + B) is 0.87. Based on this new information and your observations from the table above, which classification model would you finally choose for classification?
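When reasoning about part (b), it may help to note that A and B are disjoint and equal in size, so accuracy on (A + B) is just the size-weighted mean of the accuracies on A and on B. Since A was the training set, the combined figure mixes training performance into the estimate. The sketch below illustrates that arithmetic. The function names and the example accuracies are hypothetical and are not the values from the table.

```python
def combined_accuracy(acc_a, acc_b, n_a=50, n_b=50):
    """Accuracy on the union of two disjoint sets, as a size-weighted mean."""
    correct = acc_a * n_a + acc_b * n_b  # total correctly classified instances
    return correct / (n_a + n_b)


def held_out_accuracy(acc_combined, acc_a, n_a=50, n_b=50):
    """Recover the accuracy on B from the accuracy on (A + B) and on A."""
    correct_b = acc_combined * (n_a + n_b) - acc_a * n_a
    return correct_b / n_b


# Hypothetical accuracies, chosen only to illustrate the calculation.
acc_on_a, acc_on_b = 0.90, 0.80
both = combined_accuracy(acc_on_a, acc_on_b)
print(f"accuracy on A + B: {both:.2f}")
print(f"accuracy on B recovered: {held_out_accuracy(both, acc_on_a):.2f}")
```

Under these assumptions, a higher accuracy on (A + B) can come entirely from a higher accuracy on the training half, so it does not by itself indicate better performance on unseen instances.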