(a) Describe two potential advantages of regression trees over other statistical learning methods.
(b) When growing a regression tree using CART, two types of splits are considered. Describe these splits and provide an example for each.
(c) A regression tree has three types of nodes: the root node, internal nodes, and terminal nodes. Describe each type of node and explain how predictions are made using a regression tree.
(d) Large, bushy regression trees tend to over-fit the training data. Briefly explain what over-fitting and under-fitting the training data mean in the context of regression trees.
(e) The predictive performance of a single regression tree can be substantially improved by aggregating many regression trees.
i. Briefly explain the method of bagging regression trees.
ii. Explain the difference between bagging and random forests.
iii. Briefly explain two differences between boosted regression trees and random forests.